Japan Tests Silicon for Exascale Computing in 2021
Fujitsu and RIKEN have dropped the SPARC processor in favor of an Arm design chip scaled up for supercomputer performance
By John Boyd
Japan’s computer giant Fujitsu and RIKEN, the country’s largest research institute, have begun field-testing a prototype CPU for a next-generation supercomputer they believe will take the country back to the leading position in global rankings of supercomputer might.
The next-generation machine, dubbed the Post-K supercomputer, follows the two collaborators’ development of the 8 petaflops K supercomputer that commenced operations for RIKEN in 2012, and which has since been upgraded to 11 petaflops in application processing speed.
Now the aim is to “create the world’s highest performing supercomputer,” with “up to one hundred times the application execution performance of the K computer,” Fujitsu declared in a press release on 21 June. The plan is to install the souped-up machine at the government-affiliated RIKEN around 2021.
If the partners achieve those execution speeds, that would place the Post-K machine in exascale territory (one exaflops being a billion billion floating point operations a second). To do this, they have replaced the SPARC64 VIIIfx CPU powering the K computer with the Armv8-A SVE (Scalable Vector Extension) 512-bit architecture, which has been enhanced for supercomputer use and which both Fujitsu and RIKEN had a hand in developing.
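As a rough check on the exascale claim, the arithmetic can be sketched as below. It assumes, purely for illustration, that the "one hundred times" target is measured against the K computer's 11 petaflops application figure; the function name is hypothetical.

```python
PETA = 10**15
EXA = 10**18

def scaled_flops(baseline_petaflops, speedup):
    """Return the floating point rate, in flops, after applying a speedup factor."""
    return baseline_petaflops * PETA * speedup

# Assumption: the hundred-fold target is applied to the 11-petaflops figure.
post_k = scaled_flops(11, 100)
print(post_k / EXA, "exaflops")
```

Under that assumption the result lands slightly above one exaflops, which is consistent with the "exascale territory" description.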
The new design runs on CPUs with 48 cores plus 2 assistant cores for the computational nodes, and with 48 cores plus 4 assistant cores for the I/O and computational nodes. The system structure uses 1 CPU per node, and 384 nodes make up one rack.
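The rack-level core count implied by these figures can be worked out with a short sketch. It counts only the 48 computational cores per CPU and leaves out the assistant cores, since the mix of compute and I/O nodes per rack is not stated; the names are illustrative.

```python
NODES_PER_RACK = 384
CPUS_PER_NODE = 1
COMPUTE_CORES_PER_CPU = 48  # assistant cores deliberately excluded

def compute_cores(racks):
    """Computational cores in the given number of racks."""
    return racks * NODES_PER_RACK * CPUS_PER_NODE * COMPUTE_CORES_PER_CPU

print(compute_cores(1))  # 18432
```

Because the total node count has not been disclosed, the sketch stops at per-rack figures.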
For strategic reasons, neither Fujitsu nor RIKEN will reveal how many nodes they are targeting with the Post-K. However, Satoshi Matsuoka, director of the RIKEN Center for Computational Sciences in Kobe, says, “It will be the largest Arm system in the world and in fact, likely the largest supercomputer in the world.”
For system interconnection, Fujitsu is employing its Tofu 6D Mesh/Torus topology originally created for the K computer.
Besides the adoption of a new CPU, several other key technologies are behind the Post-K’s ramp up in execution speed, says Matsuoka. Memory bandwidth has been increased by “more than an order of magnitude,” and network bandwidth has also significantly increased.
In addition, Fujitsu has enhanced double-precision arithmetic performance over that of the K computer. And to increase application versatility, it has also added support for half-precision floating point arithmetic, which reduces memory loads in applications like AI, where lower precision is acceptable, explains Koji Uchikawa of Fujitsu’s Business Strategy and Development Division.
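The memory saving from half precision can be illustrated with Python's standard `struct` module, which supports IEEE 754 binary16 through the `"e"` format code. This is a generic sketch of the storage trade-off, not a description of Fujitsu's implementation; the array size is arbitrary.

```python
import struct

def bytes_needed(count, fmt):
    """Bytes required to store `count` values of the given struct format."""
    return count * struct.calcsize(fmt)

n = 1_000_000
double_bytes = bytes_needed(n, "d")  # IEEE 754 binary64, 8 bytes each
half_bytes = bytes_needed(n, "e")    # IEEE 754 binary16, 2 bytes each
print(double_bytes, half_bytes, double_bytes // half_bytes)

# The price is precision: 0.1 does not survive a round trip through binary16.
rounded = struct.unpack("e", struct.pack("e", 0.1))[0]
print(rounded)
```

The fourfold reduction in bytes moved per value is the "reduced memory load"; the round trip shows why this only suits workloads that tolerate lower precision.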
As well as adopting the Arm instruction set architecture, Fujitsu worked with Arm Limited, the Cambridgeshire, U.K.-based company that develops and licenses Arm technology, to implement new instructions for the scalable vector extension.
Moreover, Fujitsu has developed its own microarchitecture for the chip. Whereas a processor’s instruction set architecture interfaces between the hardware and software to provide instructions to the processor, it does not define the chip’s internal structure. Rather, that is the job of the microarchitecture, and because it directly impacts the processor’s performance, Fujitsu believes this will be an important differentiating factor in its favor.
RIKEN and Fujitsu see several other advantages in adopting the new architecture, not least the design’s inherent power-saving features such as power knobs that dial down the power in certain elements of the CPU when they are not needed. Consequently, Fujitsu is claiming a power consumption of just 30 to 40 megawatts compared to the K computer’s 12.7 MW—despite the Post-K’s target of delivering up to a hundred-fold increase in application processing speed.
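The implied gain in performance per watt follows directly from these numbers. The sketch below assumes the full hundred-fold application speedup is achieved and takes the 30 to 40 MW claim at face value; the function name is hypothetical.

```python
K_POWER_MW = 12.7

def efficiency_gain(speedup, new_power_mw, old_power_mw=K_POWER_MW):
    """Ratio of new to old performance per watt."""
    return speedup * old_power_mw / new_power_mw

# Assumption: the hundred-fold target is met at each end of the power range.
for power_mw in (30, 40):
    print(power_mw, "MW ->", round(efficiency_gain(100, power_mw), 1), "x per watt")
```

So even at the upper end of the power range, the claim amounts to roughly a thirty-fold improvement in energy efficiency over the K computer.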
Both Fujitsu and RIKEN say they also intend to leverage Arm’s large software ecosystem. “We, Fujitsu, and other collaborators will drive the Arm ecosystem in the high-end server space,” says RIKEN’s Matsuoka. This, he adds, will help contribute to any commercial success Fujitsu has “in selling not only their systems but also the chip to external companies.”
At the same time, Fujitsu “will provide a compatible performance balance with the K computer so that current applications can be migrated after recompiling,” says Uchikawa.
But the supercomputer race is nothing if not a game of hopscotch.
So when the Post-K comes online around 2021, it will find no shortage of competitors vying for the leading position. Nevertheless, RIKEN’s Matsuoka brushes aside such comparisons. “Catalog flops is not our concern. For most applications, Post-K will likely exhibit the fastest time-to-solution and utmost scalability due to its brilliant memory and network bandwidths, as well as an outstanding power-efficient design.”
No doubt it won’t be long before competitors beg to differ.
At VLSI 2018, Samsung unveiled its forthcoming 7nm FinFET platform technology, writes Gary Dagastine.
It is said to be the first mainstream semiconductor manufacturing technology to use extreme ultraviolet (EUV) lithography for single-patterning of middle- and back-end-of-the line features – EUV is expected to offer better pattern uniformity and cost advantages versus standard multiple-patterning approaches for extreme scaling.
The company used EUV, along with additional front-end scaling, special design constructs and a single diffusion break, to build transistors with a fin pitch and contacted polysilicon pitch (CPP) of 27nm and 54nm, respectively, demonstrating a 50-60% reduction in power requirements versus the company’s 10nm technology.
Samsung EUV lithography demonstrated better fidelity to the mask pattern than traditional argon fluoride (ArF) lithography
Samsung built a 256Mbit SRAM array with a cell size of just 0.0262µm² from these devices, along with CPU/GPU logic circuits, all of which met NBTI reliability requirements.
Separately, the company detailed a low-power 8nm FinFET logic technology for mobile, high-performance and low-power applications. An extension of its existing 10LPP (“Low Power Plus”) 10nm process in volume production, the new 8LPP process consumes 7% less power, is some 15% smaller in area, and can operate at 0.35V. These improvements are due to continued scaling of gate lengths and fin profiles, along with reduced contact resistance and a better sub-threshold leakage profile.
As if two new advanced manufacturing processes weren’t enough, Samsung took the wraps off yet another, which it claims offers the most competitive device performance and aggressive gate pitch (78nm) in the foundry business at technology nodes larger than 10nm. The low-power, high-performance 11LPP platform technology is for mobile and GPU applications. It adopts and enhances features from the company’s existing 14nm and 10nm technologies and incorporates updated design rules. In a ring oscillator test circuit, device performance was shown to be 25% better than Samsung’s 1st generation 14nm finFET technology. (Or if performance was held the same, then power consumption was 42% less.)
Globalfoundries weighs in
Meanwhile, Globalfoundries unveiled a 12nm FinFET technology for low-power, high-performance applications. An evolution of its 14nm 14LPP technology, the new 12LPP process reduces area and power requirements by 10% and 16%, respectively, or it can deliver a 15% performance improvement at a given leakage, all with comparable reliability and yield. In addition, SRAMs built with the new process benefit from a 30% leakage reduction.
Separately, Globalfoundries posed and answered the question, how can the use of copper interconnect be extended as scaling proceeds? The problem is that resistance in copper lines increases as line-widths shrink because of electron scattering at grain boundaries, surfaces and interfaces. Company researchers described a nanosecond flash anneal process that increases copper grain sizes, thereby removing many of these boundaries and interfaces, and providing a better interface between the copper and barrier layers, thus a less resistive path for electrons.
Copper interconnect before (a) and after (b) Globalfoundries’ nanosecond laser anneal. Left in each image is copper grain orientation; middle is copper grain phase; right is a transmission electron microscope photo. All show that copper lines have larger grains, fewer troublesome grain boundaries and better interfaces after the anneal.
The result is a 35% reduction in resistance, giving a 15% improvement in RC delays and a gain in on-current (IDsat) of 2–5%. In addition, breakdown voltage and copper reliability were enhanced. However, the company said the process, while useful for the most critical metal levels, may not be economical for all levels.
A step toward practical quantum electronics
CEA-Leti led the conference into an entirely different realm: the world of quantum electronics. Manipulating the “spin” of electrons and holes in silicon holds promise as the basis for quantum computing, which one day may be used to solve problems beyond the reach of today’s supercomputers. However, in order to make quantum computing practical and to be able to scale it up in manufacturing, a way must be found to control electron/hole spin electrically, and not with complex hybrid co-integrated micromagnet/electrical schemes.
As a step toward doing this, CEA-Leti disclosed it has experimentally demonstrated the first electric field-mediated control of the spin of electrons in silicon, using a SOI quantum dot device fabricated in a standard CMOS process flow. The underlying control mechanism is based on a complex interplay between spin-orbit coupling and the conduction band of silicon, enhanced by the device’s geometry.
MRAMs, and More MRAMs
As the industry moves to smaller nodes, the scaling of SRAM memory for embedded last-level-cache (LLC) applications becomes challenging and the search is on for alternatives. Spin-transfer-torque magnetic random access memory (STT-MRAM) has shown great promise given its speed, endurance and suitability for back-end integration, but its relatively high operating voltage is a concern.
However, TDK-Headway researchers detailed at VLSI 2018 the first STT-MRAM capable of low-voltage, low-power operation. By cleverly engineering the tunnel barrier in 30nm devices, they achieved writing voltages as low as 0.17V (for a 1ppm error rate), 20ns writing operation and 35µA writing current. They say further improvements are possible.
Samsung described an embedded STT-MRAM built in a 28nm FD-SOI process. It operates across the full industrial temperature range (-40°C to 125°C) and exhibits high endurance and more than 10 years of data retention. Memory cell operation is said to be robust in solder reflow conditions and under external magnetic disturbance.
Globalfoundries, with an eye toward automotive applications, unveiled a fully functional embedded MRAM (eMRAM) integrated into a 22-nm FD-SOI CMOS process flow. It demonstrated a bit error rate of less than 1 ppm after five solder reflows, met the automotive grade-1 data retention requirement, and demonstrated high intrinsic stand-by magnetic immunity. The researchers say these results indicate eMRAM is capable of serving a broad spectrum of eFlash applications at 22 nm or beyond.
TSMC unveiled a 16Mb embedded MRAM reference and sensing circuit in a 40nm CMOS logic process that addresses a key problem with MRAM devices: a small read window, arising from MRAM’s small tunnel-magneto-resistance ratio. TSMC addressed this problem with a hybrid-resistance-reference and novel cell configuration. The measured results show better than 1µA sense resolution and a speedy 17.5ns access time from -40°C to 125°C.
Autonomous mini-drones
Finally, autonomous, miniature drones are on their way. A Massachusetts Institute of Technology team described a chip they call Navion, which enables autonomous navigation of miniaturized robots such as nano drones.
Said to be the first fully integrated visual-inertial odometry (VIO) chip, Navion is an ASIC fabricated in 65nm CMOS and co-designed in tandem with the algorithms which run on it. It uses inertial measurements and mono/stereo images to estimate a drone’s trajectory, as well as to generate a 3D map of its environment. On-chip integration of these functions reduces energy and footprint, and eliminates costly off-chip processing and storage. It can process 752×480 stereo images at up to 171fps and inertial measurements at up to 52kHz, and is configurable to maximize accuracy, throughput, and energy-efficiency across various environmental conditions.
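To give a feel for the data rates these figures imply, a minimal sketch follows. Whether the 171fps figure counts single images or stereo pairs is not stated, so the `cameras` parameter is an assumption, as are the function and variable names.

```python
def pixel_rate(width, height, fps, cameras=1):
    """Pixels per second arriving at the chip."""
    return width * height * fps * cameras

mono = pixel_rate(752, 480, 171)
# Assumption: a stereo frame is two full-resolution images.
stereo = pixel_rate(752, 480, 171, cameras=2)
# Inertial samples available per image frame at the peak rates quoted.
imu_per_frame = 52_000 / 171

print(mono, stereo, round(imu_per_frame))
```

At peak rates that is on the order of 60 million pixels per second per camera, with roughly 300 inertial samples between consecutive frames.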
Gary Dagastine attended the recent Symposia on VLSI Technology and Circuits in Honolulu.
This year more than 900 scientists, engineers and technologists attended the annual event, which alternates between Hawaii and Kyoto, Japan. It was the highest attendance in 12 years, a noteworthy inflection given the increased cost and difficulty of continued scaling according to Moore’s Law.
And, following the path of Moore’s Law is just one approach. “We have broadened our views on how to move forward,” said Maud Vinet, VLSI committee member and manager of the Advanced CMOS Laboratories at Grenoble-based CEA-Leti. “There are lots of tricks available in terms of materials and new architectures that will help us continue to serve up innovations.”
Researchers at Purdue University and the University of Virginia are now able to create “tiny, thin-film electronic circuits peelable from a surface,” the first step in creating an unobtrusive Internet-of-Things solution. The peelable stickers can sit flush to an object’s surface and be used as sensors or wireless communications systems.
The biggest difference between these stickers and traditional solutions is the removal of the silicon wafer that manufacturers use. Because the entire circuit is transferred right on the sticker there is no need for bulky packages and you can pull off and restick the circuits as needed.
“We could customize a sensor, stick it onto a drone, and send the drone to dangerous areas to detect gas leaks, for example,” said Chi Hwan Lee, Purdue assistant professor. From the release:
A ductile metal layer, such as nickel, inserted between the electronic film and the silicon wafer, makes the peeling possible in water. These thin-film electronics can then be trimmed and pasted onto any surface, granting that object electronic features.
Putting one of the stickers on a flower pot, for example, made that flower pot capable of sensing temperature changes that could affect the plant’s growth.
The system “prints” circuits by etching the circuit on a wafer and then placing the film over the traces. Then, with the help of a little water, the researchers can peel up the film and use it as a sticker. They published their findings in the Proceedings of the National Academy of Sciences.
Activity surrounding the 5nm manufacturing process node is quickly ramping, creating a better picture of the myriad and increasingly complex design issues that must be overcome.
Progress at each new node after 28nm has required an increasingly tight partnership between the foundries, which are developing new processes and rule decks, and EDA and IP vendors, which are adding tools, methodologies, and pre-developed blocks to make all of this work. But 5nm adds some new twists, including the insertion of EUV lithography for more critical layers, and more physical and electrical effects that could affect everything from signal integrity and yield to aging and reliability after manufacturing.
“For logic, the challenge at 5nm is to properly manage the interaction between the standard cells and the power grid,” said Jean-Luc Pelloie, a fellow in Arm’s Physical Design Group. “The days where you could build a power grid without considering the standard cells are over. The architecture of the standard cells must fit with the power grid implementation. Therefore, the power grid must be selected based on the logic architecture.”
At 5nm, IR drop and electromigration issues are almost impossible to resolve if this interaction has not been properly accounted for from the beginning.
“The proper power grid also will limit the impact of the back-end-of-line (BEOL) effects, primarily the simple fact that via and metal resistances increase as we continue to shrink into 5nm,” Pelloie said. “In addition to considering the logic architecture for the power grid, a regular, evenly distributed power grid helps reduce this impact. For designs using power gates, those gates need to be inserted more frequently to not degrade the performance. This can result in an increase of the block area and can reduce the area gain when shrinking from the previous process node.”
The migration to each new node below 10/7nm is becoming much more difficult, time-consuming and expensive. In addition to the physical issues, there are changes in methodology and even in the assumptions that engineers need to make.
“You’ve got a higher-performance system, you’ve got a more accurate system, so you can do more analysis,” said Ankur Gupta, director of product engineering for the semiconductor business unit at ANSYS. “But a lot of engineering teams still have to move away from traditional IR assumptions or margins. They still have to answer the question of whether they can run more corners. And if they can run more corners, which corners do they pick? That’s the industry challenge. When running EM/IR analysis, it’s a strong function of the vectors that the engineering chooses to run. If I could manufacture the right vectors, I would have done it yesterday, but I can’t.”
Choosing the right vectors isn’t always obvious. “Technology is quickly evolving here as a combination of voltage and timing that can intelligently pick or identify the weak points,” Gupta noted. “That’s not just from a grid weakness perspective, but from the perspective of grid weakness plus sensitivity to delay, to process variation, to simultaneous switching—sensitivity to a bunch of things that ultimately can impact the path and cause a failure.”
This changes the entire design approach, he said. “Can the margins be lowered, and can flows be designed so they are convergent throughout the entire process? Could I potentially use statistical voltages instead of a flat guard band IR drop upfront and then potentially go down to these DVD waveforms — really accurate DVD waveforms — and a path to get high levels of accuracy in the signoff space? Could I potentially analyze chip, package and system? Could I potentially do all of this analysis so I don’t waste 5% margin coming from the package into the chip? At 7nm, we were talking about near-threshold compute, as in some corners are at NTC, not the entire chip, because you look at the mobile guys and they’re not always running sub-500. There are some conditions and modes where you’ll be running at sub-500, but at 5nm because of the overall thermal envelope and the overall power consumption budget, the mobile guys are probably going to be running all corners sub-600 millivolts.”
It’s not just mobile. The same is true for networking, GPUs, or AI chips, because a lot of these designs have the same total power envelope restrictions. They are packaging so many transistors into a small space that the total power consumption will dictate the max operating voltage. “You can’t burn enough power if you’re upgrading, you don’t have enough power to burn at 800 millivolts or so if the entire chip now starts to operate at 600 millivolts or lower,” Gupta said. “Then you take tens of sub-500 millivolt corners and that becomes your entire design, which puts you in the land of ‘must-have these [analysis] technologies.’ Next to 7nm, we are seeing the variation impact at 5nm in early versions of spice models is worse.”
Many of these technology and design issues have been getting worse for several nodes.
“There are more challenging pin access paradigms, more complex placement and routing constraints, more dense power-ground grid support, tighter alignment necessary between library architecture and PG grid, more and tighter electromigration considerations, lower supply voltage corners, more complex library modeling, additional physics detail in extraction modeling, more and new DRC rules,” said Mitch Lowe, vice president of R&D at Cadence. “Obviously, EUV lithography is critical, which does reduce but not eliminate multi-patterning challenges and impacts. While some things are simplified by EUV, there are some new challenges that are being addressed.”
The EDA community has been working on these issues for some time. “We are at the stage to see leading EDA solutions emerge,” Lowe said. “Much more work is ahead of us, but it is clear the 5nm technologies will be successfully deployed.”
The EDA ecosystem is heavily investing in continuous PPA optimization and tightening correlation through the integration of multiple common engines. One example is combining IR drop impacts with static timing analysis (STA) to manage the increasing risks inherent in using traditional margining approaches at 5nm, Lowe said.
Other changes may be required, as well. Mark Richards, marketing manager for the design group at Synopsys, noted that 5nm is still immature, with various foundries at different points in their development plans and execution.
“Outside of the main foundry players, which are aggressively moving to deliver a production ready flow in a very short timeframe, research is being conducted on new architectures for transistors, because to some degree the finFET is being stretched to its limit toward the 5nm node,” Richards said. “This is why there is somewhat of a tailing off in top-line performance benefits, as reported by the foundries themselves. As you deploy fin-depopulation to meet area shrink goals, this necessitates an increase in the height of the fin to mitigate the intrinsic drive reduction. That brings intrinsic capacitance issues and charging and discharging those capacitances is problematic from a performance perspective,” he explained.
Samsung and GlobalFoundries have announced plans to move to nanosheet FETs at 3nm, and TSMC is looking at nanosheet FETs and nanowires at that node. All of those are gate-all-around FETs, which are needed to reduce gate leakage beyond 5nm. There also are a number of nodelets, or stepping-stone nodes along the way, which reduce the impact of migrating to entirely new technologies.
Fig. 1: Gate-all-around FET. Source: Synopsys
At 5nm, a very strong increase in both electrical and thermal parasitics is expected, Dr. Christoph Sohrmann, advanced physical verification at Fraunhofer Institute for Integrated Circuits IIS, said. “First of all, the FinFET design will suffer from stronger self-heating. Although this will be taken care of from the technology side, the reduced spacing is a design challenge which cannot entirely be covered by static design rules. The enhanced thermal/electrical coupling across the design will effectively increase to a point where sensitive parts of the chip such as high-performance SerDes may suffer from a limited peak performance. However, this depends strongly on the use case and the isolation strategy. Choosing the right isolation technique, like design-wise and technology, requires more accurate and faster design tools, particularly focused at the parasitics in those very advanced nodes. We expect to see a lot of new physical effects which need to go into those tools. This is not too far away from quantum scale. To get the physics right, a lot of test structures will be required to fit the models of those novel tools. This is a time consuming and expensive challenge. Fewer heuristical models are also expected, with more real physical approaches in the models. On top of that, the foundries will be very cautious about those parameters and models. All future standards in this area need to account for this, too.”
Then, for 3nm and beyond, there will have to be a move to new transistor structures to continue to achieve the performance benefits that are expected at new nodes, Richards said. “With the increased introduction of stepping-stone nodes, you’re basically borrowing from the next node to some degree. When you throw a node in the middle, you kind of borrow from the next node as far as what the projected benefits will be. That’s what we’re seeing in some of these boutique nodes in between, but they are important given end-customer demand, and they do enable our customers to hit aggressive product-delivery windows.”
For any new process node, tremendous investment is required by the EDA and IP community to make sure tools, libraries and IP are aligned with the new technical specifications and capabilities. Part of this is the process design kit that design teams must adhere to for that new node.
Across the industry, there is a lot of development work ongoing for cell and IP development. “In real terms, the biggest amount of change and development work materializes in or before the 0.5-level PDK,” Richards said. “Generally, from 0.5 onward, there is a reduced delta to what the PDK would be expected to change. So normally everything’s done. Between pathfinding, 0.1 and 0.5, the big majority is done, then the rest tapers off because by that point you’ve had numerous customers do test chips, so the amount of change required is reduced. Beyond that point it’s really about building out and maturing the reference flows, building out methodologies, and really bolstering those in that 0.5 to 1.0 timeframe to make sure the promise from the scaling and the performance perspective are going to be realizable in real chips.”
Fig. 2: 5nm nanosheet. Source: IBM
To move or not to move
Another consideration many semiconductor companies currently face is whether to migrate to the next node at all, or at least not so quickly, or whether to move in completely different directions.
“New architectures are going to be accepted,” said Wally Rhines, president and CEO of Mentor, a Siemens Business. “They’re going to be designed in. They will have machine learning in many or most cases, because your brain has the ability to learn from experience. I visited 20 or more companies doing their own special-purpose AI processor of one sort or another, and they each have their own little angle. But you’re going to see them in specific applications increasingly, and they will complement the traditional von Neumann architecture. Neuromorphic computing will become mainstream, and it’s a big piece of how we take the next step in efficiency of computation, reducing the cost, doing things in both mobile and connected environments that today we have to go to a big server farm to solve.”
Others are expected to stay the course, at least for now.
“Many of our customers are already engaged in 5nm work,” Richards said. “They’re trying to work out what this node shift brings for them because obviously the scaling benefits on paper are very different to the scaling benefits that they can realize in a real design — their own designs with their own specific challenges — and so they’re trying to work out what is a real scaling, what are the real performance benefits, is this tractable, is it a good methodology to use, and a good plan from a product perspective.”
Today, the expectation for early adoption of 5nm will be mobile applications, he said. “TSMC itself quoted a 20% bump from N7, and, to my knowledge, an unknown bump from 7++ . Realistically, mobile is a good application, where area – slated to be 45% vs. N7 – is really going to provide a big differentiation. You’ll get the power and performance benefits that are also important but with the latest IP cores growing in complexity and area, you need to have the freedom to develop a differentiated cluster and aggressive area shrinks will allow for that.”
The key metrics are always performance, power and area, and the tradeoffs between all of those are becoming more difficult. Increasing performance brings a subsequent increase in dynamic power, which makes IR drop more challenging. That requires more time to be spent tuning the power grid so designs can deliver enough power, but not kill the design routability along the way.
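The link between performance and dynamic power can be made concrete with the standard first-order CMOS switching model, P = a·C·V²·f. That model is textbook background rather than something stated in the article, and the activity factor, capacitance and frequency below are arbitrary placeholder values.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order CMOS switching power in watts: a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v**2 * frequency_hz

# Placeholder values, chosen only to make the ratios visible.
base = dynamic_power(0.2, 1e-9, 0.8, 2e9)
faster = dynamic_power(0.2, 1e-9, 0.8, 2.4e9)   # 20% higher clock
lower_v = dynamic_power(0.2, 1e-9, 0.6, 2e9)    # 800 mV -> 600 mV

print(faster / base)   # power rises in step with frequency
print(lower_v / base)  # power falls with the square of voltage
```

In this model a 20% clock increase costs 20% more switching power, while dropping the supply from 800 to 600 millivolts cuts it to a little over half, which is why total power budgets push operating voltages down.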
“The key thing with power really is how to get power down to the standard cells,” said Richards. “You just can’t put the cells close enough together because it spoils the resources with power grid. This means working early in the flow with power and its implications. On an SoC design you might see very different power grids, depending on the performance requirements of each of the blocks on the SoC. It’s not just a one size fits all. It must be tuned per block, and that’s challenging in itself. Having the analysis and the sign-off ability within the design platform is now going to become more and more important as you make those tradeoffs.”
Narrower margin
At the same time, the margin between the threshold and the operating voltages is now so small at 5nm that extra analysis is a must.
TSMC and Samsung both have mentioned extreme low-Vt cells, which are paramount for really pushing performance at 5nm, where the threshold and operating voltage are very close together.
“The nonlinearities and the strange behaviors that happen when you’re in that phase need to be modeled and captured to be able to drop it as low as possible,” he said. “Obviously LVF (Liberty Variation Format) was required at 7nm, for when the operating voltage was getting very, very low and very close to the threshold, but now even when you’re running what you would not consider an extremely low power design with extremely low voltage Vt cells effectively, you’re back in the same position. You’ve closed that gap again, and now LVF and modeling those things is very important.”
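Why a shrinking gap between supply and threshold voltage demands more analysis can be sketched with the alpha-power-law delay model, in which gate delay scales as V / (V - Vt)^alpha. This model is a common approximation brought in here for illustration, not something cited in the article, and the threshold voltage, alpha and droop values are assumed.

```python
def relative_delay(vdd, vt=0.3, alpha=1.3):
    """Gate delay up to a constant factor, per the alpha-power law."""
    return vdd / (vdd - vt) ** alpha

def droop_penalty(vdd, droop=0.05, vt=0.3, alpha=1.3):
    """Delay ratio when the supply sags by the given fraction (IR drop)."""
    return relative_delay(vdd * (1 - droop), vt, alpha) / relative_delay(vdd, vt, alpha)

# Assumed operating points: a comfortable supply and a near-threshold one.
for vdd in (0.8, 0.5):
    print(vdd, "V ->", round((droop_penalty(vdd) - 1) * 100, 1), "% slower")
```

With these assumed parameters, the same 5% supply droop costs roughly twice as much delay at 0.5 V as at 0.8 V, and the curve steepens further as the supply approaches the threshold, which is the nonlinear regime that flat margins handle poorly.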
Inductance, electromagnetic effects
Indeed, with the move to 7nm and 5nm, the trends are clear: increasing frequencies, tighter margins, denser integrated circuits, and new devices and materials, stressed Magdy Abadir, vice president of marketing at Helic. He noted that during the recent Design Automation Conference, a panel discussed and debated such questions as: where and when full electromagnetic (EM) verification should be included; whether ignoring magnetic effects leads to more silicon failures during development; whether the methodology of applying best practices to avoid EM coupling and skipping the tedious EM verification part should still be a valid practice; whether this methodology is scalable to 5nm integrated circuits and below; whether the dense matrices resulting from inductive coupling and the difficulty of simulations are the main reason why industry did not widely adopt full EM simulations; and what can be done in terms of tool development, education, and research to lower the barrier for industry to adopt full EM simulation.
“The panel members all agreed strongly that the full EM analysis is becoming fundamental in at least some key parts of any cutting-edge chip. A panelist from Synopsys was of the opinion that it is needed in some key places in a chip such as clocking, wide data busses, and power distribution, but not yet in mainstream digital design. An Intel panelist was of the opinion that for current chips, applying best practices and skipping using full EM simulations still works, however this methodology will not scale into the future. A panelist from Nvidia simply stated that EM simulations is a must with his very high frequency SERDES designs, and a panelist from Helic agreed strongly here, and showed examples of unexpected EM coupling causing failures in key chips. The moderator was of the opinion that magnetic effects are already there strongly and have been very significant in integrated circuits for a while, but the difficulty of including magnetic effects into simulation, and manipulating very large and dense matrices resulting from inductive coupling is the main reason full EM verification is not mainstream yet. Everyone agreed that not including EM effects in verification results in overdesign at best and potential failures,” Abadir offered.
In the end, the panel agreed that there is a need for significant improvement of tools that handle EM verification, better understanding of magnetic effects, and significant research on how to protect against EM failures or even adopt designs that benefit from magnetic effects. The panel also agreed that current trends of higher frequencies, denser circuits, and scaling of devices, combined with the exploding penalty for a chip failure, make including full EM verification imperative, he added.
An additional challenge at 5nm is the accuracy of waveform propagation. Waveform propagation is notoriously expensive from a runtime perspective, and as a result needs to be captured throughout the entire design flow. Otherwise, the surprise at sign-off would be that the design is too big to close.
The typical way to solve these problems is by adding margin into the design. But margining has become an increasingly thorny issue ever since the advent of finFETs, because dimensions are so small that extra circuitry reduces the PPA benefits of scaling. So rather than just adding margin, design teams are being forced to adhere to foundry models and rules much more closely.
“Foundries do provide models of devices that represent corner models,” said Deepak Sabharwal, vice president of IP engineering at eSilicon. “In the past, you were told the corner models capture the extremes of what would be manufactured, but that is no longer the case. Today, there are still corner models, but there are also variation models, both global and local. Global variation capture the global means of manufacturing, such as when multiple lots are run at a foundry, each lot is going to behave in a certain manner and that is captured as part of my global variation. Local variation models represent when I’m on a die and my die has a Gig of elements. Then I have the middle point of my distribution, and what the outliers are on that distribution.”
At 5nm, both the global plus the local variation must be considered, because they are incremental.
“At the same time, these kinds of analysis are experience-driven,” Sabharwal said. “How much margin do you add, and also make sure you do not go overboard? If you design for too much of your sigma, you ended up being uncompetitive. That’s what you have to watch out for, and that’s really where the experience comes in. You have to make sure you put in enough margin that you can sleep at night, but not kill your product by putting in too much extra area that you don’t need to put in.”
More than ever, 5nm brings together a range of new challenges. “When you think about the billions of components sitting on that chip, it explains why the size of the teams needed to build these chips is now increasing as you flip from one generation to the next. It’s all these challenges that are coming our way. These problems are going to remain, where people will come up with techniques to resolve them and just continue business as usual. Engineering is really that art of building stuff that will work reliably all the time,” Sabharwal said.
The prefab modular grad student housing building at 2711 Shattuck Ave. (left) photographed on Aug. 1. Photo: Tracey Taylor
Imagine a four-story apartment building going up in four days, and from steel.
It happened in Berkeley, a city known for its glacial progress in building housing.
Check out 2711 Shattuck Ave. near downtown Berkeley. Four stories. Four days in July. Including beds, sinks, sofas, and stoves.
This new 22-unit project from local developer Patrick Kennedy (Panoramic Interests) is the first in the nation to be constructed of prefabricated all-steel modular units made in China. Each module, which looks a little like a sleekly designed shipping container with a picture window on one end, is stacked on another like giant Legos.
The project, initially approved by the city in 2010 as a hotel, then re-approved in 2015 as studio apartments, will be leased to UC Berkeley for graduate student housing. Called Shattuck Studios, it’s slated to be open for move-in for the fall semester.
The Cal grad student housing at 2711 Shattuck Ave. is slated to open at the end of August. Rendering: Panoramic Interests
“This is the first steel modular project from China in America,” Kennedy said, adding that new tariffs on imported Chinese steel hadn’t affected this project.
The modules were shipped to Oakland then trucked to the site. Kennedy notes that the cost of trucking to Berkeley from the port of Oakland was more expensive than the cost of shipping from Hong Kong.
The modules are effectively ready-to-go 310-square-foot studio apartments with a bathroom, closets, a front entry area, and a main room with a kitchenette and sofa that converts to a queen-size bed. They come with flat-screen TVs and coffee makers.
“In order to be feasible, modular construction requires standardized unit sizes and design, and economies of scale,” Kennedy said.
The complex has no car parking, but 22 bicycle parking spots. It has no elevator, and no interior common rooms except hallways, but has a shared outdoor patio/BBQ area. ADA accessible units are on the ground floor.
Floors in each unit are bamboo and tile. The appliances are stainless steel. The bathroom has an over-sized shower. The entry room has a “gear wall” for hanging backpacks, skateboards, bike helmets. Colors are grays and beiges and light browns.
“Our units reflect the more austere, minimalist NorCal sensibility,” Kennedy said, during a recent tour of the complex. “Less but better.”
Interior view of 2711 Shattuck Ave. Rendering: Panoramic Interests
The modules were stacked on a conventional foundation. Electricity, plumbing, the roof, landscaping and other infrastructure were added.
Using prefab material is supposed to be less expensive than building from scratch, Kennedy said. He had anticipated significantly lower costs by going prefab for this project.
But the savings haven’t been as great as expected, he said. “Sixty-five to seventy-five-percent of the construction costs are still incurred on the site. In addition to the usual trades, we have crane operators, flagmen, truckers and special inspectors.”
He’s still evaluating bottom-line costs.
“We are very happy with the quality of construction and the finished product — but we learned that smaller sites posed lots of difficulties — access, traffic management, proximity to neighbors,” said Kennedy, who works with Pankow Builders of Oakland. “We might have saved some money building this conventionally, but we view this more as a research & development project — and in that capacity, it was very helpful and educating.”
Crane hoisting prefab modules for new UC Berkeley housing at 2711 Shattuck Ave. Photo: Panoramic Interests
Prefab construction probably makes more financial sense with larger projects (more units) on larger lots, Kennedy said. “If you don’t have space to work it gets very expensive very quickly.”
The goal — and hope — is that prefab will open the door to more affordable housing through lower construction costs. “We’re still trying to determine the optimal size. It’s a pretty new idea here in Northern California. We are learning as we go,” he said.
Kennedy said he knows of a few locations on the West Coast that sell similar modules, but they’re backlogged by years. So he went overseas. “The industry is evolving rapidly, and we are always looking to bring down costs. . . We would love to use local firms.” He built one previous prefab apartment project in San Francisco with a Sacramento manufacturer that is now out of business.
In lieu of providing affordable units on site, Kennedy will pay a fee to the city of Berkeley’s Affordable Housing Trust Fund, as required under the city’s affordable housing laws. The amount is around $500,000, he said.
In a few weeks, roughly four months from the start of construction, nearly two dozen UC Berkeley graduate students should be moving into the complex.
Inside a studio at 2711 Shattuck Ave. Photo: Panoramic Interests
The units will rent for $2,180 monthly for single-occupancy, said Kyle Gibson, director of communications for UC Berkeley Capital Strategies. One unit is reserved for a residential assistant (RA). UC has a three-year lease with Kennedy’s firm.
Panoramic Interests will do building maintenance and cleaning.
Gibson said the university wasn’t involved in the design or construction, and he had no comment on the prefab approach. The project is one of several new developments recently completed or in the pipeline to increase student housing, he said. Some are university-built and owned, others leased.
“The University welcomes any and all projects and developments that expand the availability of affordable, accessible student housing in close proximity to campus,” Gibson said.
“It’s been an incredibly valuable tutorial for us. We know prefab is going to be the future, we just don’t know how we’re going to be part of it,” Kennedy said. “I’m chastened by the complexity of doing something so seemingly simple as stacking boxes on top of each other.”
The world’s most powerful particle collider has yet to turn up new physics — now some physicists are turning to a different strategy.
Davide Castelvecchi
The ATLAS detector at the Large Hadron Collider near Geneva, Switzerland. Credit: Stefano Dal Pozzolo/Contrasto/eyevine
A once-controversial approach to particle physics has entered the mainstream at the Large Hadron Collider (LHC). The LHC’s major ATLAS experiment has officially thrown its weight behind the method — an alternative way to hunt through the reams of data created by the machine — as the collider’s best hope for detecting behaviour that goes beyond the standard model of particle physics. Conventional techniques have so far come up empty-handed.
So far, almost all studies at the LHC — at CERN, Europe’s particle-physics laboratory near Geneva, Switzerland — have involved ‘targeted searches’ for signatures of favoured theories. The ATLAS collaboration now describes its first all-out ‘general’ search of the detector’s data, in a preprint posted on the arXiv server (ref. 1) last month and submitted to European Physical Journal C. Another major LHC experiment, CMS, is working on a similar project.
“My goal is to try to come up with a really new way to look for new physics” — one driven by the data rather than by theory, says Sascha Caron of Radboud University Nijmegen in the Netherlands, who has led the push for the approach at ATLAS. General searches are to the targeted ones what spell checking an entire text is to searching that text for a particular word. These broad searches could realize their full potential in the near future, when combined with increasingly sophisticated artificial-intelligence (AI) methods.
LHC researchers hope that the methods will lead them to their next big discovery — something that hasn’t happened since the detection of the Higgs boson in 2012, which put in place the final piece of the standard model. Developed in the 1960s and 1970s, the model describes all known subatomic particles, but physicists suspect that there is more to the story — the theory doesn’t account for dark matter, for instance. But big experiments such as the LHC have yet to find evidence for such behaviour. That means it's important to try new things, including general searches, says Gian Giudice, who heads CERN’s theory department and is not involved in any of the experiments. “This is the right approach, at this point.”
Collision course
The LHC smashes together millions of protons per second at colossal energies to produce a profusion of decay particles, which are recorded by detectors such as ATLAS and CMS. Many different types of particle interaction can produce the same debris. For example, the decay of a Higgs might produce a pair of photons, but so do other, more common, processes. So, to search for the Higgs, physicists first ran simulations to predict how many of those ‘impostor’ pairs to expect. They then counted all photon pairs recorded in the detector and compared them to their simulations. The difference — a slight excess of photon pairs within a narrow range of energies — was evidence that the Higgs existed.
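The counting logic described above can be sketched in a few lines. This is only an illustration: the event counts are made up, the function name is hypothetical, and the simple (observed minus expected) over square-root-of-expected formula is a textbook Gaussian approximation, not the statistical treatment the experiments actually use.

```python
import math

def excess_significance(observed: int, expected_background: float) -> float:
    """Approximate significance, in standard deviations, of a counting excess.

    Uses the Gaussian approximation (n - b) / sqrt(b), which is only
    reasonable when the expected background b is large.
    """
    if expected_background <= 0:
        raise ValueError("expected_background must be positive")
    return (observed - expected_background) / math.sqrt(expected_background)

# Hypothetical numbers: simulations predict 10,000 'impostor' photon pairs
# in an energy window, and the detector records 10,500.
z = excess_significance(observed=10_500, expected_background=10_000.0)
print(f"excess of 500 pairs, roughly {z:.1f} standard deviations")
```

A real analysis would also account for systematic uncertainties on the simulated background, which this sketch ignores.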
ATLAS and CMS have run hundreds more of these targeted searches to look for particles that do not appear in the standard model. Many searches have looked for various flavours of supersymmetry, a theorized extension of the model that includes hypothesized particles such as the neutralino, a candidate for dark matter. But these searches have come up empty so far.
This leaves open the possibility that there are exotic particles that produce signatures no one has thought of — something that general searches have a better chance of finding. Physicists have yet to look at, for example, events that produced three photons instead of two, Caron says. “We have hundreds of people looking at Higgs decay and supersymmetry, but maybe we are missing something nobody thought of,” says Arnd Meyer, a CMS member at Aachen University in Germany.
Whereas targeted searches typically look at only a handful of the many types of decay product, the latest study looked at more than 700 types at once. The study analysed data collected in 2015, the first year after an LHC upgrade raised the energy of proton collisions in the collider from 8 teraelectronvolts (TeV) to 13 TeV. At CMS, Meyer and a few collaborators have conducted a proof-of-principle study, which hasn’t been published, on a smaller set of data from the 8 TeV run.
Neither experiment has found significant deviations so far. This was not surprising, the teams say, because the data sets were relatively small. Both ATLAS and CMS are now searching the data collected in 2016 and 2017, a trove tens of times larger.
Statistical cons
The approach “has clear advantages, but also clear shortcomings”, says Markus Klute, a physicist at the Massachusetts Institute of Technology in Cambridge. Klute is part of CMS and has worked on general searches at previous experiments, but he was not directly involved in the more recent studies. One limitation is statistical power. If a targeted search finds a positive result, there are standard procedures for calculating its significance; when casting a wide net, however, some false positives are bound to arise. That was one reason that general searches had not been favoured in the past: many physicists feared that they could lead down too many blind alleys. But the teams say they have put a lot of work into making their methods more solid. “I am excited this came forward,” says Klute.
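To see why a wide net produces false positives, consider a rough sketch that assumes every search channel is statistically independent (a simplification; real channels overlap) and uses 700 channels, the figure quoted for the ATLAS study. The function names and the per-channel threshold are illustrative assumptions.

```python
def prob_any_false_positive(n_channels: int, p_single: float) -> float:
    """Chance that at least one of n independent channels fluctuates
    past a threshold whose per-channel false-positive rate is p_single."""
    return 1.0 - (1.0 - p_single) ** n_channels

def per_channel_threshold(n_channels: int, overall_rate: float) -> float:
    """Bonferroni-style correction: the per-channel rate that keeps the
    overall false-positive rate at or below overall_rate."""
    return overall_rate / n_channels

# One-sided 3-sigma fluctuation: about 0.135% per channel.
P_THREE_SIGMA = 0.00135

p_any = prob_any_false_positive(700, P_THREE_SIGMA)
print(f"chance of at least one 3-sigma fluke in 700 channels: {p_any:.0%}")
print(f"per-channel rate for a 5% overall rate: {per_channel_threshold(700, 0.05):.2e}")
```

Under these assumptions, a single 3-sigma deviation somewhere among hundreds of channels is more likely than not, which is why a general search needs a stricter per-channel criterion than a targeted one.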
Most of the people power and resources at the LHC experiments still go into targeted searches, and that might not change anytime soon. “Some people doubt the usefulness of such general searches, given that we have so many searches that exhaustively cover much of the parameter space,” says Tulika Bose of Boston University in Massachusetts, who helps to coordinate the research programme at CMS.
Many researchers who work on general searches say that they eventually want to use AI to do away with standard-model simulations altogether. Proponents of this approach hope to use machine learning to find patterns in the data without any theoretical bias. “We want to reverse the strategy — let the data tell us where to look next,” Caron says. Computer scientists are also pushing towards this type of ‘unsupervised’ machine learning — compared with the supervised type, in which the machine ‘learns’ from going through data that have been tagged previously by humans.
Thanks to the modern electric grid, you have access to electricity whenever you want. But the grid only works when electricity is generated in the same amounts as it is consumed. That said, it’s impossible to get the balance right all the time. So operators make grids more flexible by adding ways to store excess electricity for when production drops or consumption rises.
About 96% of the world’s energy-storage capacity comes in the form of one technology: pumped hydro. Whenever generation exceeds demand, the excess electricity is used to pump water up a dam. When demand exceeds generation, that water is allowed to fall—thanks to gravity—and the potential energy turns turbines to produce electricity.
But pumped-hydro storage requires particular geographies, with access to water and to reservoirs at different altitudes. It’s the reason that about three-quarters of all pumped hydro storage has been built in only 10 countries. The trouble is the world needs to add a lot more energy storage, if we are to continue to add the intermittent solar and wind power necessary to cut our dependence on fossil fuels.
A startup called Energy Vault thinks it has a viable alternative to pumped-hydro: Instead of using water and dams, the startup uses concrete blocks and cranes. It has been operating in stealth mode until today (Aug. 18), when its existence will be announced at Kent Presents, an ideas festival in Connecticut.
On a hot July morning, I traveled to Biasca, Switzerland, about two hours north of Milan, Italy, where Energy Vault has built a demonstration plant, about a tenth the size of a full-scale operation. The whole thing—from idea to a functional unit—took about nine months and less than $2 million to accomplish. If this sort of low-tech, low-cost innovation could help solve even just a few parts of the huge energy-storage problem, maybe the energy transition the world needs won’t be so hard after all.
Concrete plan
The science underlying Energy Vault’s technology is simple. When you lift something against gravity, you store energy in it. When you later let it fall, you can retrieve that energy. Because concrete is a lot denser than water, lifting a block of concrete requires—and can, therefore, store—a lot more energy than an equal-sized tank of water.
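The physics here is the standard formula for gravitational potential energy, E = m·g·h. The sketch below compares equal volumes of water and concrete. The densities (1,000 and 2,400 kg/m³), the 10 m³ volume, and the 100 m lift height are assumed illustrative values, not figures from Energy Vault.

```python
G = 9.81                 # gravitational acceleration, m/s^2
JOULES_PER_KWH = 3.6e6   # 1 kWh expressed in joules

def potential_energy_kwh(mass_kg: float, height_m: float) -> float:
    """Energy stored by lifting mass_kg through height_m, in kWh (E = m*g*h)."""
    return mass_kg * G * height_m / JOULES_PER_KWH

# Assumed typical densities in kg/m^3 (not taken from the article).
WATER_DENSITY = 1000.0
CONCRETE_DENSITY = 2400.0

volume_m3 = 10.0   # hypothetical equal volume for both materials
lift_m = 100.0     # hypothetical lift height

water_kwh = potential_energy_kwh(WATER_DENSITY * volume_m3, lift_m)
concrete_kwh = potential_energy_kwh(CONCRETE_DENSITY * volume_m3, lift_m)
print(f"water: {water_kwh:.2f} kWh, concrete: {concrete_kwh:.2f} kWh")

# A single 35-metric-ton block lifted through the same assumed height.
block_kwh = potential_energy_kwh(35_000, lift_m)
print(f"35-tonne block lifted {lift_m:.0f} m: {block_kwh:.2f} kWh")
```

For equal volumes and equal lift heights, the stored energy scales directly with density, so the ratio of the two results is just the ratio of the assumed densities.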
Bill Gross, a long-time US entrepreneur, and Andrea Pedretti, a serial Swiss inventor, developed the Energy Vault system that applies this science. Here’s how it works: A 120-meter (nearly 400-foot) tall, six-armed crane stands in the middle. In the discharged state, concrete cylinders weighing 35 metric tons each are neatly stacked around the crane far below the crane arms. When there is excess solar or wind power, a computer algorithm directs one or more crane arms to locate a concrete block, with the help of a camera attached to the crane arm’s trolley.
Simulation of a large-scale Energy Vault plant.
Once the crane arm locates and hooks onto a concrete block, a motor starts, powered by the excess electricity on the grid, and lifts the block off the ground. Wind could cause the block to move like a pendulum, but the crane’s trolley is programmed to counter the movement. As a result, it can smoothly lift the block, and then place it on top of another stack of blocks—higher up off the ground.
The system is “fully charged” when the crane has created a tower of concrete blocks around it. The total energy that can be stored in the tower is 20 megawatt-hours (MWh), enough to power 2,000 Swiss homes for a whole day.
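As a quick sanity check on those figures, the per-household share works out as below. The helper name is made up for illustration, and the only inputs are the 20 MWh and 2,000 homes quoted above.

```python
def kwh_per_home_per_day(stored_mwh: float, homes: int, days: float = 1.0) -> float:
    """Average energy available to each home per day, in kWh."""
    return stored_mwh * 1000.0 / (homes * days)

share = kwh_per_home_per_day(stored_mwh=20.0, homes=2000)
print(f"{share:.0f} kWh per home per day")
```

In other words, the claim implies a daily household consumption of about 10 kWh.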
When the grid is running low, the motors spring back into action—except now, instead of consuming electricity, the motor is driven in reverse by the gravitational energy, and thus generates electricity.
Big up
The innovation in Energy Vault’s plant is not the hardware. Cranes and motors have been around for decades, and companies like ABB and Siemens have optimized them for maximum efficiency. The round-trip efficiency of the system, which is the amount of energy recovered for every unit of energy used to lift the blocks, is about 85%—comparable to lithium-ion batteries, which offer up to 90%.
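Round-trip efficiency is simply the ratio of energy out to energy in. A minimal sketch, using the 85% and 90% figures above and a hypothetical 1,000 kWh of surplus electricity:

```python
def recovered_energy_kwh(energy_in_kwh: float, round_trip_efficiency: float) -> float:
    """Energy returned to the grid for a given amount spent charging."""
    if not 0.0 <= round_trip_efficiency <= 1.0:
        raise ValueError("round_trip_efficiency must be between 0 and 1")
    return energy_in_kwh * round_trip_efficiency

surplus_kwh = 1000.0  # hypothetical surplus used to lift blocks or charge cells
for name, efficiency in [("concrete tower", 0.85), ("lithium-ion (best case)", 0.90)]:
    out = recovered_energy_kwh(surplus_kwh, efficiency)
    print(f"{name}: {out:.0f} kWh recovered, {surplus_kwh - out:.0f} kWh lost")
```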
Pedretti’s main work as the chief technology officer has been figuring out how to design software to automate contextually relevant operations, like hooking and unhooking concrete blocks, and to counteract pendulum-like movements during the lifting and lowering of those blocks.
Energy Vault keeps costs low because it uses off-the-shelf commercial hardware. Surprisingly, concrete blocks could prove to be the most expensive part of the energy tower. Concrete is much cheaper than, say, a lithium-ion battery, but Energy Vault would need a lot of concrete to build hundreds of 35-metric-ton blocks.
So Pedretti found another solution. He’s developed a machine that can mix substances that cities often pay to get rid of, such as gravel or building waste, along with cement to create low-cost concrete blocks. The cost saving comes from having to use only a sixth of the amount of cement that would otherwise have been needed if the concrete were used for building construction.
Akshat Rathi for Quartz
Rob Piconi (left) and Andrea Pedretti.
The storage challenge
The demonstration plant I saw in Biasca is much smaller than the planned commercial version. It has a 20-meter-tall, single-armed crane that lifts blocks weighing 500 kg each. But it does almost all the things its full-scale cousin, which the company is actively looking to sell right now, would do.
Robert Piconi has spent this summer visiting countries in Africa and Asia. The CEO of Energy Vault is excited to find customers for its plants in those parts of the world. The startup also has a sales team in the US and it now has orders to build its first commercial units in early 2019. The company won’t share details of those orders, but the unique characteristics of its energy-storage solution mean we can make a fairly educated guess at what the projects will look like.
Energy-storage experts broadly categorize energy-storage into three groups, distinguished by the amount of energy storage needed and the cost of storing that energy.
First, expensive technologies, such as lithium-ion batteries, can be used to store a few hours worth of energy—in the range of tens or hundreds of MWh. These could be charged during the day, using solar panels for example, and then discharged when the sun isn’t around. But lithium-ion batteries for the electric grid currently cost between $280 and $350 per kWh.
Cheaper technologies, such as flow batteries (which use high-energy liquid chemicals to hold energy) can be used to store weeks worth of energy—in the range of hundreds or thousands of MWh. This second category of energy storage could then be used, for instance, when there’s a lull in wind supply for a week or two.
The third category doesn’t exist yet. In theory, yet-to-be-invented, extra-cheap technologies could store months worth of energy—in the range of tens or hundreds of thousands of MWh—which would be used to deal with interseasonal demands. For example, Mumbai hits peak consumption in the summer when air conditioners are on full blast, whereas London peaks in winters because of household heating. Ideally, energy captured in one season could be stored for months during low-use seasons, and then deployed later in the high-use seasons.
David vs Goliath
Piconi estimates that by the time Energy Vault builds its 10th or so 35-MWh plant, it can bring costs down to about $150 per kWh. That means it can’t fill the needs of the third category of energy-storage use; to do that, costs would have to be closer to $10 per kWh. In theory, at the current capacity and price point, it could compete in the second category—if it could find a customer that wanted Energy Vault to build dozens of plants for a single grid. Realistically, Energy Vault’s best bet is to compete in the first category.
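To put the per-kWh prices in context, the sketch below computes the up-front cost of one 35-MWh plant at each price point mentioned in the article. It looks at capital cost only, using a hypothetical helper, and ignores lifetime, land, and maintenance.

```python
def capital_cost_usd(capacity_mwh: float, cost_per_kwh_usd: float) -> float:
    """Up-front cost of a storage system of the given capacity."""
    return capacity_mwh * 1000.0 * cost_per_kwh_usd

PLANT_MWH = 35.0
price_points = {
    "Energy Vault target ($150/kWh)": 150.0,
    "grid lithium-ion, low end ($280/kWh)": 280.0,
    "grid lithium-ion, high end ($350/kWh)": 350.0,
    "seasonal-storage threshold ($10/kWh)": 10.0,
}
for label, price in price_points.items():
    print(f"{label}: ${capital_cost_usd(PLANT_MWH, price):,.0f}")
```

At $150 per kWh a 35-MWh plant comes to about $5.25 million, roughly half the cost of the same capacity in lithium-ion at today's quoted grid prices, but 15 times the cost that seasonal storage would require.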
That said, some experts told Quartz that the cost of lithium-ion batteries, the current dominant battery technology, could fall to about $100 per kWh, which would make them cheaper even than Energy Vault when it comes to storing days or weeks worth of energy. And because batteries are compact, they can be transported vast distances. Most of the lithium-ion batteries in smartphones used all over the world, for example, are made in East Asia. Energy Vault’s concrete blocks will have to be built on-site, and each 35 MWh system would need a circular piece of land about 100 meters (300 feet) in diameter. Batteries need a fraction of that space to store the same amount of energy.
Batteries do have some limitations. The maximum life of lithium-ion batteries, for example, is 20 or so years. They also lose their capacity to store energy over time. And there aren’t yet reliable ways to recycle lithium-ion batteries.
Energy Vault’s plant can operate for 30 years with little maintenance and almost no fade in capacity. Its concrete blocks also use waste materials. So Piconi is confident that there’s still a niche that Energy Vault can fill: Places that have abundant access to land and building material, combined with the desire to have storage technologies that last for decades without fading in capacity.
Meanwhile, whether or not Energy Vault succeeds, it does make a strong case for the argument that, while everyone else is out looking for high-tech, futuristic battery innovation, there may be real value in thinking about how to apply low-tech solutions to 21st-century problems. Energy Vault built a functional test plant in just nine months, spending relative pennies. It’s a signal of sorts that some of the answers to our energy-storage problems may still be sitting hidden in plain sight.
This article was updated with information about Energy Vault’s first commercial-unit orders.
For as long as anyone can remember, EUV has been “just a few years away.” This changed back in 2016 when Samsung put their foot down, announcing that their 8nm node will be the last DUV-based process technology. All nodes moving forward will use EUV. As Yan Borodovsky said at the 2018 SPIE conference, EUV is no longer a question of if or when but how well. At the 2018 Symposia on VLSI Technology and Circuits, Samsung gave us a first glimpse of what their 7nm EUV process looks like. Samsung’s second-generation 7nm process technology was presented by WonCheol Jeong, Principal Research Engineer at Samsung.
2nd Generation 7nm?
What Samsung presented at the symposia was what they consider “2nd generation 7nm”. Samsung’s naming is confusing and almost intentionally obfuscated. I asked Jeong about this and he said that by 2nd generation, they are referring to Samsung’s “7LPP”, whereas their 1st generation refers to “7LPE”, which will likely never see the light of day. Unfortunately, WikiChip has been through this situation before with Samsung’s presentation of their “2nd generation 10nm” last year, which ended up being 8nm “8LPP”; it is therefore entirely possible that this 2nd Gen 7nm node really refers to their “6nm” or “5nm” nodes. To avoid possible confusion, we will not be using “7LPP” and will instead stick to the name Samsung used in their presentation (“2nd Gen 7nm”).
Design Features
Samsung’s second-generation 7nm process builds on many of their earlier technologies developed over the years.
5th generation FinFET
2nd generation hybrid N/P
5th generation S/D engineering
3rd generation gate stack
What’s interesting is that both their 2nd generation 7nm and their 8nm 8LPP share many of those rules, including the fin, S/D, and gate engineering. In fact, we can show the overlap much better in a table below which includes their 14, 10, 8, and 7 nanometer nodes.
Samsung Technology Comparison
Technologies compared: 14LPP, 10LPP, 1st Gen 7nm, 8LPP, 2nd Gen 7nm
Features compared: Fin, Gate, S/D Eng, SDB, Gate Stack
From a technology point of view, 8LPP shares many of the device manufacturing details with 2nd Gen 7nm, more so than the first-generation 7nm.
All the pitches reported above are the tightest numbers reported to date for a leading edge foundry.
EUV
For their 10nm, Samsung has been using Litho-Etch-Litho-Etch-Litho-Etch (LELELE or LE3). For their 7nm, Samsung has eliminated most of the complex patterning by using a single-exposure EUV for the three critical layers – fin, contact, and Mx. Samsung reports a mask reduction of >25% when compared to using ArF immersion lithography for comparable features, which translates to cost and time reduction.
EUV mask reduction compared to ArF MPT (VLSI 2018, Samsung)
Cell
For their 7nm, Samsung’s high-density cell has a height of 9 fins or 243nm which works out to 6.75 tracks. This is a cell height reduction of 0.58x over their 10nm or 0.64x over their 8nm.
Samsung’s 14nm, 10nm, 8nm, and 7nm std cells (WikiChip)
The high-density cell is a 2-fin device configuration.
10, 8, and 7 nanometer device configuration (WikiChip)
For a NAND2 cell, 7nm takes up a total area of 0.0394 µm², down from 0.0723 µm² in 8nm or 0.086 µm² in 10nm. That’s a 0.54x and 0.46x scaling relative to 8nm and 10nm respectively.
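The scaling factors follow directly from the quoted NAND2 areas. A short check, using only the three area figures given above (the dictionary and function names are illustrative):

```python
# NAND2 cell areas in square micrometres, as quoted in the text.
NAND2_AREA_UM2 = {"10nm": 0.086, "8nm": 0.0723, "7nm": 0.0394}

def area_scaling(new_node: str, old_node: str) -> float:
    """Ratio of the NAND2 cell area on new_node to that on old_node."""
    return NAND2_AREA_UM2[new_node] / NAND2_AREA_UM2[old_node]

for old in ("8nm", "10nm"):
    print(f"7nm vs {old}: {area_scaling('7nm', old):.2f}x")
```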
NAND2 Cell Scaling (WikiChip)
HP Cell
In addition to the high-density cell, Samsung also offers a high-performance cell.
2nd Generation 7nm Std Cell (columns: Cell, Device, Height, Tracks)
High-density cell: 243nm (9-fin x 27nm)
High-performance cell: 270nm (10-fin x 27nm)
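The cell heights are just fin count times the 27nm fin pitch. The sketch below also converts heights to tracks, assuming a 36nm track pitch. That pitch is not stated in the text; it is inferred from the high-density figures (243nm / 6.75 tracks), so the track count printed for the high-performance cell is a derived estimate rather than a reported number.

```python
FIN_PITCH_NM = 27     # from the cell table: heights are N fins x 27nm
TRACK_PITCH_NM = 36   # assumption: inferred from 243nm / 6.75 tracks

def cell_height_nm(fins: int) -> int:
    """Standard-cell height for a given number of fin pitches."""
    return fins * FIN_PITCH_NM

def cell_tracks(fins: int) -> float:
    """Cell height expressed in routing tracks, under the assumed track pitch."""
    return cell_height_nm(fins) / TRACK_PITCH_NM

for name, fins in [("high-density", 9), ("high-performance", 10)]:
    print(f"{name}: {cell_height_nm(fins)}nm, {cell_tracks(fins)} tracks")
```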
Pattern Fidelity
One of the many limitations with conventional multi-patterning techniques is pattern fidelity. What you see is often not what you get.
(VLSI 2018, Samsung)
For their 7nm, Samsung is reporting EUV 2D fidelity to be 70% better than ArF multi-patterning.
The alloy is 100 times more durable than high-strength steel, putting it in the same class as diamond and sapphire for wear-resistant materials. “We showed there’s a fundamental change you can make to some alloys that will impart this tremendous increase in performance over a broad range of real, practical metals,” said Nic Argibay, a materials scientist at Sandia.
Strictly speaking, the three researchers – Sergey Bravyi, David Gosset, and Robert König – have shown that
parallel quantum algorithms running in a constant time period are strictly more powerful than their classical counterparts; they are provably better at solving certain linear algebra problems associated with binary quadratic forms.
The proof they provided is based on an algorithm to solve a quadratic “hidden linear function” problem that can be implemented in quantum constant-depth. A hidden linear function is a linear function that is not entirely known but is “hidden” inside of another function you can calculate. For example, a linear function could be hidden inside of an oracle that can be queried. The challenge is to fully characterize the hidden linear function based on the results of applying the known function. If this sounds somewhat similar to the problem of inverting a public key to find its private counterpart, it is no surprise, since this is exactly what it is about. In the case of an oracle, the problem is solved by the well-known Bernstein-Vazirani algorithm, which minimizes the number of queries to the oracle. Now, according to the three researchers, the fact that the Bernstein-Vazirani algorithm is applied to an oracle limits its practical applicability, so they suggest “hiding” a linear function inside a two-dimensional grid graph. After proving that this is indeed possible, they built a quantum constant-depth algorithm to find the hidden function.
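For readers unfamiliar with the oracle version of the problem, here is a small sketch of the Bernstein-Vazirani setting. It illustrates only the oracle case, not the grid-graph construction from the paper. The oracle hides a bit string s and returns s·x mod 2. A classical solver needs one query per bit, while the quantum algorithm needs a single query. The quantum part is imitated with a brute-force state-vector calculation, so all function names are illustrative and the simulation itself evaluates the oracle for every basis state.

```python
def dot_mod2(a: int, b: int) -> int:
    """Inner product of two bit strings (packed as integers) modulo 2."""
    return bin(a & b).count("1") % 2

def make_oracle(secret: int):
    """Return f(x) = secret . x mod 2, the hidden linear function."""
    return lambda x: dot_mod2(secret, x)

def classical_recover(oracle, n_bits: int) -> int:
    """Recover the secret with n_bits queries, one unit vector at a time."""
    secret = 0
    for i in range(n_bits):
        if oracle(1 << i):
            secret |= 1 << i
    return secret

def simulated_bernstein_vazirani(oracle, n_bits: int) -> int:
    """State-vector imitation of the quantum circuit H^n, phase oracle, H^n.

    On quantum hardware the oracle is applied once, to a superposition.
    This classical simulation has to evaluate it for every basis state.
    """
    size = 1 << n_bits
    norm = size ** 0.5
    # After the first Hadamard layer and a single phase-oracle call.
    amps = [(-1) ** oracle(x) / norm for x in range(size)]
    # Second Hadamard layer: amplitude of each outcome y.
    out = [sum(amps[x] * (-1) ** dot_mod2(x, y) for x in range(size)) / norm
           for y in range(size)]
    # All amplitude ends up on y == secret, so measurement returns it.
    return max(range(size), key=lambda y: abs(out[y]))

oracle = make_oracle(0b1011)
print(classical_recover(oracle, 4), simulated_bernstein_vazirani(oracle, 4))
```

Both routes return the same hidden string. The difference lies in how many oracle calls a real device would need.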
The other half of the proof provided by the researchers shows that, unlike a quantum circuit, any classical circuit needs to increase its depth as the number of inputs grows. For example, while the quantum algorithm can solve the problem using a quantum circuit of depth at most 10 no matter how many inputs you have, you need, say, a classical circuit of depth 10 for a 16-input problem; a circuit of depth 14 for a 32-input problem; a circuit of depth 20 for a 64-input problem, and so on. This second part of the proof is philosophically deeply interesting, since it dwells on the idea of quantum nonlocality, which in turn is related to quantum entanglement, one of the most peculiar properties of quantum processors along with superposition. So, quantum advantage would seem to derive from the most intrinsic properties of quantum physics.
At the theoretical level, the value of this achievement is not to be underestimated either. As IBM Q Vice President Bob Sutor wrote:
The proof is the first demonstration of unconditional separation between quantum and classical algorithms, albeit in the special case of constant-depth computations.
Previously, the idea that quantum computers were more powerful than classical ones was based on factorization problems. Shor showed quantum computers can factor an integer in polynomial time, i.e. more efficiently than any known classical algorithm. Albeit an interesting result, this did not rule out the possibility that a more efficient classical factorization algorithm could indeed be found. So unless one conjectured that no efficient classical solution to the factorization problem could exist, a claim that would itself imply “P ≠ NP”, one could not really say that quantum advantage was proved.
As mentioned, Bravyi, Gosset, and König’s algorithm, relying on a constant number of operations (the depth of a quantum circuit), seems to fit just right with the limitations of current quantum computer processors. Those are basically related to qubits’ error rate and coherence time, which limit the maximal duration of a sequence of operations and their overall number. Therefore, using short-depth circuits is key for any feasible application of current quantum circuits. Thanks to this property of the proposed algorithm, IBM researchers are already at work to demonstrate quantum advantage using an IBM quantum computer, Sutor remarks.