
Technology Stocks : New Technology


From: FJB 7/16/2023 10:53:47 AM
   of 426
 
HOW TO USE AI



oneusefulthing.org

How to Use AI to Do Stuff: An Opinionated Guide
Ethan Mollick









Covering the state of play as of Summer, 2023


Increasingly powerful AI systems are being released at an increasingly rapid pace. This week saw the debut of Claude 2, likely the second most capable AI system available to the public. The week before, OpenAI released Code Interpreter, the most sophisticated mode of AI yet available. The week before that, some AIs got the ability to see images.

And yet not a single AI lab seems to have provided any user documentation. Instead, the only user guides out there appear to be Twitter influencer threads. Documentation-by-rumor is a weird choice for organizations claiming to be concerned about proper use of their technologies, but here we are.

I can’t claim that this is going to be a complete user guide, but it will serve as a bit of orientation to the current state of AI. I have been putting together a Getting Started Guide to AI for my students (and interested readers) every few months, and each time, it requires major modifications. The last couple of months have been particularly insane.

This guide is opinionated, based on my experience, and focused on how to pick the right tool to do things. I have written separately about the kinds of tasks you may want AI to do, which might be useful to read first.

When we talk about AI right now, we are usually talking about Large Language Models, or LLMs. Most AI applications are powered by LLMs, of which there are just a few Foundation Models, created by a handful of organizations. Each company gives direct access to their models via a Chatbot: OpenAI makes GPT-3.5 and GPT-4, which power ChatGPT and Microsoft’s Bing (access it on an Edge browser). Google has a variety of models under the label of Bard. And Anthropic makes Claude and Claude 2.

There are other LLMs I won’t be discussing. The first is Pi, a chatbot built by Inflection. Pi is optimized for conversation, and really, really wants to be your friend (seriously, try it to see what I mean). It does not like to do much besides chat, and trying to get it to do work for you is an exercise in frustration. We also won’t cover the variety of open source models that anyone can use and modify. They are generally not accessible or useful for the casual user today, but have real promise. Future guides may include them.

So here is your quick reference chart, summarizing the state of LLMs:



The first four (including Bing) are all OpenAI systems. There are basically two major OpenAI AIs today: 3.5 and 4. The 3.5 model kicked off the current AI craze in November; the 4 model premiered in the spring and is much more powerful. A new variation uses plugins to connect to the internet and other apps. There are a lot of plugins, most of which are not very useful, but you should feel free to explore them as needed. Code Interpreter is an extremely powerful version of ChatGPT that can run Python programs. If you have never paid for OpenAI, you have only used 3.5. Aside from the plugins variation, and a temporarily suspended version of GPT-4 with browsing, none of these models are connected to the internet. Microsoft’s Bing uses a mix of 4 and 3.5, and is usually the first model in the GPT-4 family to roll out new features. For example, it can both create and view images, and it can read documents in the web browser. It is connected to the internet. Bing is a bit weird to use, but powerful.
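For readers who would rather script these models than use the chat interface, here is a minimal sketch of selecting between the two OpenAI model families through the openai Python package. It assumes the pre-1.0 package interface and the "gpt-3.5-turbo" and "gpt-4" model names as they existed in mid-2023; both are assumptions that may not match your account or a later SDK.

```python
# Minimal sketch: choosing between the GPT-3.5 and GPT-4 model families
# via the openai Python package (pre-1.0 interface, as of mid-2023).
# Model names and availability are assumptions and may differ for your account.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumes the key is set in the environment

def ask(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Send a single-turn chat request and return the model's reply text."""
    response = openai.ChatCompletion.create(
        model=model,  # "gpt-3.5-turbo" (free ChatGPT tier) or "gpt-4" (paid)
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return response["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Summarize the difference between GPT-3.5 and GPT-4 in one sentence."))
```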

Google has been testing its own AI for consumer use, which they call Bard, but which is powered by a variety of Foundation Models, most recently one called PaLM 2. For the company that developed LLM technology, they have been pretty disappointing, although improvements announced yesterday show they are still working on the underlying technology, so I have hope. It has already gained the capability to run limited code and interpret images, but I would generally avoid it for now.

The final company, Anthropic, has released Claude 2. Claude is most notable for having a very large context window - essentially the memory of the LLM. Claude can hold almost an entire book, or many PDFs, in memory. It has been built to be less likely to act maliciously than other Large Language Models, which means, practically, that it tends to scold you a bit about stuff.

Now, on to some uses:

Best free options: Bing and Claude 2
Paid option: ChatGPT 4.0/ChatGPT with plugins

For right now, GPT-4 is still the most capable AI tool for writing, which you can access at Bing (select “creative mode”) for free or by purchasing a $20/month subscription to ChatGPT. Claude, however, is a close second, and has a limited free option available.

These tools are also being integrated directly into common office applications. Microsoft Office will include a copilot powered by GPT and Google Docs will integrate suggestions from Bard. The implications of what these new innovations mean for writing are pretty profound.

Here are some ways to use AI to help you write.

  • Writing drafts of anything. Blog posts, essays, promotional material, speeches, lectures, choose-your-own adventures, scripts, short stories - you name it, AI does it, and pretty well. All you have to do is prompt it. Prompt crafting is not magic: basic prompts result in boring writing, but getting better at prompting is not that hard; just work interactively with the system (a short interactive-drafting sketch follows this list). You will find AI systems to be much more capable as writers with a little practice.

  • Make your writing better. Paste your text into an AI. Ask it to improve the content, or for suggestions about how to make it better for a particular audience. Ask it to create 10 drafts in radically different styles. Ask it to make things more vivid, or add examples. Use it to inspire you to do better work.

  • Help you with tasks. AI can do things you don’t have the time to do. Use it like an intern to write emails, create sales templates, give you next steps in a business plan, and a lot more. Here is what I could accomplish with it in 30 minutes in supporting a product launch.

  • Unblock yourself. It is very easy to get distracted from a task by one difficult challenge. AI provides a way of giving yourself momentum.
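As mentioned in the first bullet, here is a minimal sketch of working interactively with the model: generate a draft, then keep refining it with follow-up instructions in the same conversation. It again assumes the pre-1.0 openai package; the refinement prompts are only examples.

```python
# Minimal sketch of interactive drafting: generate a draft, then refine it
# with follow-up instructions in the same conversation. Assumes the
# pre-1.0 openai package; adapt the client calls to your installed version.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

messages = [{"role": "user", "content": "Write a 150-word blog intro about using AI at work."}]
refinements = [
    "Make it more vivid and add one concrete example.",
    "Rewrite it for a skeptical executive audience.",
]

for step in range(len(refinements) + 1):
    reply = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    draft = reply["choices"][0]["message"]["content"]
    print(f"--- Draft {step + 1} ---\n{draft}\n")
    messages.append({"role": "assistant", "content": draft})  # keep the conversation history
    if step < len(refinements):
        messages.append({"role": "user", "content": refinements[step]})
```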




Some things to worry about: In a bid to respond to your questions, it is very easy for the AI to “hallucinate” and generate plausible facts. It can generate entirely false content that is utterly convincing. Let me emphasize that: AI lies continuously and well. Every fact or piece of information it tells you may be incorrect. You will need to check it all. Particularly dangerous is asking it for references, quotes, citations, and information from the internet (for the models that are not connected to the internet). Bing will usually hallucinate less than other models, because GPT-4 is generally more grounded and because Bing’s internet connection means it can actually pull in relevant facts. Here is a guide to avoiding hallucinations, but they are impossible to completely eliminate.

And also note that AI doesn’t explain itself, it only makes you think it does. If you ask it to explain why it wrote something, it will give you a plausible answer that is completely made up. When you ask it for its thought process, it is not interrogating its own actions; it is just generating text that sounds like it is doing so. This makes understanding biases in the system very challenging, even though those biases almost certainly exist.

It also can be used unethically to manipulate or cheat. You are responsible for the output of these tools.

There are several major AI image generators worth knowing:

  1. Stable Diffusion, which is open source and you can run from any high-end computer. It takes effort to get started, since you have to learn to craft prompts properly, but once you do it can produce great results. It is especially good for combining AI with images from other sources. Here is a nice guide to Stable Diffusion if you go that route (be sure to read both part 1 and part 2), and a minimal local-generation sketch follows this list.

  2. DALL-E, from OpenAI, which is incorporated into Bing (you have to use creative mode) and Bing image creator. This system is solid, but worse than Midjourney.

  3. Midjourney, which is the best system in mid-2023. It has the lowest learning-curve of any system: just type in "thing-you-want-to-see --v 5.2" (the --v 5.2 at the end is important, it uses the latest model) and you get a great result. Midjourney requires Discord. Here is a guide to using Discord.

  4. Adobe Firefly, built into a variety of Adobe products, but it lags DALL-E and Midjourney in terms of quality. However, while the other two models have been unclear about the source images that they used to train their AIs, Adobe has declared that it is only using images it has the right to use.
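For the Stable Diffusion route mentioned in item 1, here is a minimal local-generation sketch using the Hugging Face diffusers library. The checkpoint name ("runwayml/stable-diffusion-v1-5"), the CUDA GPU requirement, and the sampling settings are illustrative assumptions, not part of the original guide.

```python
# Minimal sketch of running Stable Diffusion locally with the diffusers library.
# Assumes a CUDA-capable GPU and the "runwayml/stable-diffusion-v1-5" checkpoint;
# both are assumptions, and the first run downloads several gigabytes of weights.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to fit consumer GPUs
)
pipe = pipe.to("cuda")

prompt = "fashion photoshoot of sneakers inspired by Van Gogh"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("sneakers_van_gogh.png")
```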

Here is how they compare (each image is labelled with the model):



Prompt: “Fashion photoshoot of sneakers inspired by Van Gogh” - the first images that were created by each model
Some things to worry about: These systems are built around models that have built-in biases due to their training on Internet data (if you ask one to create a picture of an entrepreneur, for example, you will likely see more pictures featuring men than women, unless you specify “female entrepreneur”). You can use this explorer to see these biases at work.

These systems are also trained on existing art on the internet in ways that are not transparent and potentially legally and ethically questionable. Though technically you own copyright of the images created, legal rules are still hazy.

Also, right now, they don’t create text, just a bunch of stuff that looks like text. But Midjourney has nailed hands.

Best free option: Bing
Paid option: ChatGPT 4.0, but Bing is likely better because of its internet connections

Despite (or in fact, because of) all its constraints and weirdness, AI is perfect for idea generation. You often need to have a lot of ideas to have good ideas, and AI is good at volume. With the right prompting, you can also force it to be very creative. Ask Bing in creative mode to look up your favorite unusual idea generation techniques, like Brian Eno's oblique strategies or Marshall McLuhan's tetrads, and apply them. Or ask for something weird, like ideas inspired by a random patent, or your favorite superhero…



Best animation tool: D-iD for animating faces in videos. Runway v2 for creating videos from text
Best voice cloning: ElevenLabs

It is now trivial to generate a video with a completely AI generated character, reading a completely AI-written script, talking in an AI-made voice, animated by AI. It can also deepfake people, as you can see in this link where I deepfaked myself. Instructions and more information here. Use with caution, but this can be great for explainer videos and introductions.

The first commercially available text-to-video tool was also recently released, Runway v2. It creates short 4-second clips, and is more of a demonstration of what is to come, but is worth taking a look at if you want a sense of the future development in this space.

Some things to worry about: Deep fakes are a huge concern, and these systems need to be used ethically.

For data (and also any weird ideas you have with code): Code Interpreter
For documents: Claude 2 for large documents or many documents at once, Bing Sidebar for smaller documents and webpages (the sidebar, part of the Edge browser, can “see” what is in your browser, letting Bing work with that information, though the size of the context window is limited)

I wrote about Code Interpreter last week. It is a mode of GPT-4 that lets you upload files to the AI, allows the AI to write and run code, and lets you download the results provided by the AI. It can be used to execute programs, run data analysis (though you will need to know enough about statistics and data to check its work), and create all sorts of files, web pages, and even games. Though there has been a lot of debate since its release about the risks associated with untrained people using it for analysis, many experts testing Code Interpreter are pretty impressed, to the degree that one paper suggests it will require changing the way we train data scientists. Go to my previous post if you want more details on how to use it. I also made an initial prompt to set up Code Interpreter to create useful data visualizations. It gives it some basic principles of good chart design & also reminds it that it can output many kinds of files. You can find that here.
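To give a sense of what Code Interpreter actually does with an uploaded file, here is a minimal sketch of the kind of pandas/matplotlib analysis it might write and run. The file name ("sales.csv") and its columns are hypothetical.

```python
# Minimal sketch of the kind of analysis Code Interpreter generates and runs
# for an uploaded file. The file name "sales.csv" and its columns are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv", parse_dates=["date"])

# Basic checks a careful analyst (or a well-prompted AI) should report first.
print(df.describe(include="all"))
print("Missing values per column:\n", df.isna().sum())

# Monthly revenue trend, following simple chart-design principles:
# one series, labelled axes, no unnecessary decoration.
monthly = df.set_index("date")["revenue"].resample("M").sum()
ax = monthly.plot(kind="line", title="Monthly revenue")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
plt.tight_layout()
plt.savefig("monthly_revenue.png")
```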

For working with text, and especially PDFs, Claude 2 is excellent so far. I have pasted entire books into the previous version of Claude, with impressive results, and the new model is much stronger. You can see my previous experience, and some prompts that might be interesting to use, here. I also gave it numerous complex academic articles and asked it to summarize the results, and it does a good job! Even better, you can then interrogate the material by asking follow-up questions: what is the evidence for that approach? What do the authors conclude? And so on…
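A rough sketch of the "context window as memory" idea: estimate whether a document fits before pasting it in. The four-characters-per-token rule of thumb and the 100,000-token window are illustrative assumptions; real tokenizers and limits vary by model.

```python
# Rough sketch: estimate whether a document fits in a large context window.
# The ~4 characters-per-token heuristic and the 100,000-token window are
# illustrative assumptions; real tokenizers and limits vary by model.
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW_TOKENS = 100_000

def estimated_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserve_for_reply: int = 2_000) -> bool:
    """Leave some room for the model's answer as well as the document."""
    return estimated_tokens(text) + reserve_for_reply <= CONTEXT_WINDOW_TOKENS

if __name__ == "__main__":
    with open("book.txt", encoding="utf-8") as f:  # hypothetical file
        book = f.read()
    print(f"~{estimated_tokens(book):,} tokens; fits: {fits_in_context(book)}")
```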



Some things to worry about: These systems still hallucinate, though in more limited ways. You need to check over their results if you want to ensure accuracy.

Best free option: Bing
Paid option: Usually Bing is best. For children, Khanmigo from Khan Academy offers good AI-driven tutoring powered by GPT-4.

If you are going to use AI as a search engine, probably don’t do that. The risk of hallucination is high and most AIs are not connected to the Internet, anyway (which is why I suggest you use Bing. Bard, Google’s AI, hallucinates much more). However, there is some evidence that AI can often provide more useful answers than search when used carefully, according to a recent pilot study. Especially in cases where search engines aren’t very good, like tech support, deciding where to eat, or getting advice, Bing is often better than Google as a starting point. This is an area that is evolving rapidly, but you should be careful about these uses for now. You don’t want to get in trouble.

But more exciting is the possibility of using AIs to help education, including helping us learn. I have written about how AI can be used for teaching and to help make teachers’ lives easier and their lessons more effective, but it can also work for self-guided learning as well. You can ask the AI to explain concepts and get very good results. This prompt is a good automated tutor, and you can find a direct link to activate the tutor in ChatGPT here. Because we know the AI could be hallucinating, you would be wise to (carefully!) double-check any critical data against another source.

Thanks to rapid advances in technology, these are likely the worst AI tools you will ever use, as the past few months of development have shown. I have no doubt I will need to make a new guide soon. But remember two key points that remain true about AI:

  • AI is a tool. It is not always the right tool. Consider carefully whether, given its weaknesses, it is right for the purpose to which you are planning to apply it.

  • There are many ethical concerns you need to be aware of. AI can be used to infringe on copyright, or to cheat, or to steal the work of others, or to manipulate. And how a particular AI model is built and who benefits from its use are often complex issues, and not particularly clear at this stage. Ultimately, you are responsible for using these tools in an ethical manner.

We are in the early days of a very rapidly advancing revolution. Are there other uses you want to share? Let me know in the comments.



This post is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



To: Kate Jenkins who wrote (366) 5/17/2024 11:14:23 AM
From: lianass
   of 426
 
I



From: FJB 7/4/2024 3:14:28 AM
   of 426
 
compression utility written in Haskell

lazamar.github.io



From: FJB 7/21/2024 10:43:05 AM
   of 426
 
Intel Vs. Samsung Vs. TSMC


semiengineering.com

Ed Sperling






The three leading-edge foundries — Intel, Samsung, and TSMC — have started filling in some key pieces in their roadmaps, adding aggressive delivery dates for future generations of chip technology and setting the stage for significant improvements in performance with faster delivery time for custom designs.

Unlike in the past, when a single industry roadmap dictated how to get to the next process node, the three largest foundries increasingly are forging their own paths. They all are heading in the same general direction with 3D transistors and packages, a slew of enabling and expansive technologies, and much larger and more diverse ecosystems. But some key differences are emerging in their methodologies, architectures, and third-party enablement.

Roadmaps for all three show that transistor scaling will continue at least into the 18/16/14 angstrom range, with a possible move from nanosheets and forksheet FETs, followed by complementary FETs (CFETs) at some point in the future. The key drivers are AI/ML and the explosion of data that needs to be processed, and in most cases these will involve arrays of processing elements, usually with high levels of redundancy and homogeneity, in order to achieve higher yields.

In other cases, these designs may contain dozens or hundreds of chiplets, some engineered for specific data types and others for more general processing. Those chiplets can be mounted on a substrate in a 2.5D configuration, an approach that has gained traction in data centers because it simplifies the integration of high-bandwidth memory (HBM), as well as in mobile devices, which also include other features such as image sensors, power supplies, and additional digital logic used for non-critical functions. All three foundries are working on full 3D-ICs, as well. And there will be hybrid options available, where logic is stacked on logic and mounted on a substrate, but separated from other features in order to minimize physical effects such as heat — a heterogeneous configuration that has been called both 3.5D and 5.5D.

Rapid and mass customization
One of the biggest changes involves bringing domain-specific designs to market much more quickly than in the past. Mundane as this may sound, it’s a competitive necessity for many leading-edge chips, and it requires fundamental changes in the way chips are designed, manufactured, and packaged. Making this scheme work demands a combination of standards, innovative connectivity schemes, and a mix of engineering disciplines that in the past had limited interactions, if any.

Sometimes referred to as “mass customization,” it includes the usual power, performance, and area/cost (PPA/C) tradeoffs, as well as rapid assembly options. That is the promise of heterogeneous chiplet assemblies, and from a scaling perspective it marks the next phase of Moore’s Law. The entire semiconductor ecosystem has been laying the groundwork for this shift incrementally for more than a decade.

But getting heterogeneous chiplets — essentially hardened IP from multiple vendors and foundries — to work together is both a necessary and daunting engineering challenge. The first step is connecting the chiplets together in a consistent way to achieve predictable results, and this is where the foundries have spent much of their effort, particularly with the Universal Chiplet Interconnect Express (UCIe) and Bunch of Wires (BoW) standards. While that connectivity is a critical requirement for all three, it’s also one of the main areas of divergence.

Intel Foundry’s current solution, prior to fully integrated 3D-ICs, is to develop what industry sources describe as “sockets” for chiplets. Instead of characterizing each chiplet for a commercial marketplace, the company defines the specification and the interface so that chiplet vendors can develop these limited-function mini-chips to meet those specs. That addresses one of the big stumbling blocks for a commercial chiplet marketplace. All the pieces need to work together, from data speed to thermal and noise management.

Intel’s scheme relies heavily on its Embedded Multi-Die Interconnect Bridge (EMIB), first introduced in 2014. “The really cool thing about an EMIB base is you can add any amount of chiplets,” said Lalitha Immaneni, vice president of technology development at Intel. “We don’t have a limitation on the number of IPs that we can use in design, and it won’t increase the interposer size, so it’s cost-effective and it’s agnostic of the process. We have given out a package assembly design kit, which is like your traditional PDK for the assembly. We give them the design rules, the reference flows, and we tell them the allowable constructions. It will also give them any collaterals that we need to take it into our assembly.”

Depending upon the design, there can be multiple EMIBs in a package, complemented by thermal interface materials (TIMs), in order to dissipate heat that can become trapped inside a package. TIMs typically are pads that are engineered to conduct heat away from the source, and they are becoming more common as the amount of compute inside a package increases and as the substrates are thinned to shorten the distance signals need to travel.

But the thinner the substrate, the less effective it is at heat dissipation, which can result in thermal gradients that are workload-dependent and therefore difficult to anticipate. Eliminating that heat may require TIMs, additional heat sinks, and potentially even more exotic cooling approaches such as microfluidics.

Both TSMC and Samsung offer bridges, as well. Samsung has embedded bridges inside the RDL — an approach it calls 2.3D, or I-Cube E — and it’s using them to connect sub-systems to those bridges in order to speed time to working silicon. Instead of relying on a socket approach, some of the integration work will be pre-done in known-good modules.

“Putting together two, four, or eight CPUs into a system is something that very sophisticated customers know how to go out and do,” said Arm CEO Rene Haas, in a keynote speech at a recent Samsung Foundry event. “But if you want to build an SoC that has 128 CPUs attached to a neural network, memory structures, interrupt controllers that interface to an NPU, an off-chip bus to go to another chiplet, that is a lot of work. In the last year and a half, we’ve seen a rush of people building these complex SoCs wanting more from us.”

Samsung also has been building mini-consortia [1] of chiplet providers, targeted at specific markets. The initial concept is that one company builds an I/O die, another builds the interconnect, and a third builds the logic, and when that is proven to work, then others are added into the mix to provide more choices for customers.

TSMC has experimented with a number of different options, including both RDL and non-RDL bridges, fan-outs, 2.5D chip-on-wafer-on-substrate (CoWoS), and System On Integrated Chips (SoIC), a 3D-IC concept in which chiplets are packed and stacked inside a substrate using very short interconnects. In fact, TSMC has a process design kit for just about every application, and it has been active in creating assembly design kits for advanced packaging, including reference designs to go with them.

The challenge is that foundry customers willing to invest in these complex packages increasingly want very customized solutions. To facilitate that, TSMC rolled out a new language called 3Dblox, a top-down design scheme that fuses physical and connectivity constructs, allowing assertions to be applied across both. This sandbox approach allows customers to leverage any of its packaging approaches — InFO, CoWoS, and SoIC. It’s also essential to TSMC’s business model, because the company is the only pure-play foundry of the three [2] — although both Intel and Samsung have distanced their foundry operations in recent months.

“We started from a concept of modularization,” said Jim Chang, vice president of advanced technology and mask engineering at TSMC, in a presentation when 3Dblox was first introduced in 2023. “We can build a full 3D-IC stacking with this kind of language syntax plus assertions.”

Chang said the genesis of this was a lack of consistency between the physical and connectivity design tools. But he added that once this approach was developed, it also enabled reuse of chiplets in different designs because much of the characterization was already well-defined and the designs are modular.


Fig. 1: TSMC’s 3Dblox approach. Source: TSMC

Samsung followed with its own system description language, 3DCODE, in December 2023. Both Samsung and TSMC claim their languages are standards, but they’re more like new foundry rule decks because it’s unlikely these languages will be used outside of their own ecosystems. Intel’s 2.5D approach doesn’t require a new language because the rules are dictated by the socket specification, trading off some customization with a shortened time to market and a simpler approach for chiplet developers.

The chiplet challenge
Chiplets have obvious benefits. They can be designed independently at whatever process node makes sense, which is particularly important for analog features. But figuring out how to put the pieces together with predictable results has been a major challenge. The initial LEGO-like architecture scheme floated by DARPA has proven much more complicated than first envisioned, and it has required a massive and ongoing effort by broad ecosystems to make it work.

Chiplets need to be precisely synchronized so that critical data is processed, stored, and retrieved without delay. Otherwise, there can be timing issues, in which one computation is either delayed or out-of-sync with other computations, leading to delays and potential deadlocks. In the context of mission- or safety-critical applications, the loss of a fraction of a second can have serious consequences.

Simplifying the design process, particularly with domain-specific designs where one size does not fit all, is an incredibly complex endeavor. The goal for all three foundries is to provide more options for companies that will be developing high-performance, low-power chips. With an estimated 30% to 35% of all leading-edge design starts now in the hands of large systems companies such as Google, Meta, Microsoft, and Tesla, the economics of leading-edge chip and package design have changed significantly, and so have the PPA/C formulas and tradeoffs.

Chips developed for these systems companies probably will not be sold commercially. So if they can achieve higher performance per watt, then the design and manufacturing costs can be offset by lower cooling power and higher utilization rates — and potentially fewer servers. The reverse is true for chips sold into mobile devices and commodity servers, where high development costs can be amortized across huge volumes. The economics for customized designs in advanced packages work for both, but for very different reasons.

Scaling down, up, and out
It’s assumed that within these complex systems of chiplets there will be multiple types of processors, some highly specialized and others more general-purpose. At least some of these will likely be developed at the most advanced process nodes due to limited power budgets. Advanced nodes still provide higher energy efficiency, which allows more transistors to be packed into the same area in order to improve performance. This is critical for AI/ML applications, where processing more data faster requires more multiply/accumulate operations in highly parallel configurations. Smaller transistors provide greater energy efficiency, allowing more processing per square millimeter of silicon, but the gate structure needs to be changed to prevent leakage, which is why forksheet FETs and CFETs are on the horizon.

Put simply, process leadership still has value. Being first to market with a leading-edge process is good for business, but it’s only one piece of a much larger puzzle. All three foundries have announced plans to push well into the angstrom range. Intel plans to introduce its 18A this year, followed by 14A a couple years later.


Fig. 2: Intel’s process roadmap. Source: Intel Foundry

TSMC, meanwhile, will add A16 in 2027 (see figure 3, below.)


Fig. 3: TSMC’s scaling roadmap into the angstrom era. Source: TSMC

And Samsung will push to 14 angstroms sometime in 2027 with its SF1.4, apparently skipping 18/16 angstroms. (See figure 4)


Fig. 4: Samsung’s process scaling roadmap. Source: Samsung Foundry

From a process node standpoint, all three foundries are on the same track. But advances are no longer tied to the process node alone. The focus increasingly is about latency and performance per watt in a specific domain, and this is where stacking logic-on-logic in a true 3D-IC configuration will excel, using hybrid bonds to connect chiplets to a substrate and each other. Moving electrons through a wire on a planar die is still the fastest (assuming a signal doesn’t have to travel from one end of the die to another), but stacking transistors on top of other transistors is the next best thing, and in some cases even better than a planar SoC because some vertical signal paths may be shorter.

In a recent presentation, Taejoong Song, Samsung Foundry’s vice president of foundry business development, showed a roadmap featuring logic-on-logic mounted on a substrate, combining a 2nm (SF2) die on top of a 4nm (SF4X) die, both mounted on top of another substrate. This is basically a 3D-IC on a 2.5D package, which is the 3.5D or 5.5D concept mentioned earlier. Song said the foundry will begin stacking an SF1.4 on top of SF2P, starting in 2027. What’s particularly attractive about this approach are the thermal dissipation possibilities. With the logic separated from other functions, heat can be channeled away from the stacked dies through the substrate or any of the five exposed sides.


Fig. 5: Samsung’s 3D-IC architecture for AI. Source: Samsung

Intel, meanwhile, will leverage its Foveros Direct 3D to stack logic on logic, either face-to-face or face-to-back. The approach allows chips or wafers from different foundries to be combined, with the connection bandwidth determined by the copper via pitch, according to a new Intel white paper. The paper noted that the first version would use a copper pitch of 9µm, while the second generation would use a 3µm pitch.


Fig. 6: Intel’s Foveros Direct 3D. Source: Intel

“The true 3D-IC comes with Foveros, and then also with hybrid bonds,” said Intel’s Immaneni. “You cannot go in the traditional route of design where you put it together and run validation, and then find, ‘Oops, I have an issue.’ You cannot afford to do this anymore because you’re impacting your time to market. So you really want to provide a sandbox to make it predictable. But even before I step into this detailed design environment, I want to run my mechanical/electrical/thermal analysis. I want to look at the connectivity so I don’t have opens and shorts. The burden for 3D-IC resides more in the co-design than the execution.”

Foveros allows an active logic die to be stacked on either another active or passive die, with the base die used to connect all the die in a package at a 36 micron pitch. By leveraging advanced sort, Intel claims it can guarantee 99% known good die, and 97% yield at post-assembly test.
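Those known-good-die numbers matter because assembly yield compounds across chiplets. A minimal sketch, assuming the 99% known-good-die rate quoted above and statistically independent defects (a simplification):

```python
# Simple sketch of why known-good-die rates matter: if each of N chiplets is
# good with probability p, and defects are independent (a simplification),
# the chance that an entire assembly contains only good die is p**N.
def assembly_yield(known_good_die_rate: float, num_chiplets: int) -> float:
    return known_good_die_rate ** num_chiplets

for n in (4, 8, 16, 32, 64):
    print(f"{n:3d} chiplets at 99% KGD -> {assembly_yield(0.99, n):.1%} all-good assemblies")
# e.g. 0.99**64 is roughly 52.6%, before any post-assembly losses.
```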

TSMC’s CoWoS, meanwhile, already is in use by NVIDIA and AMD for their advanced packaging for AI chips. CoWoS is essentially a 2.5D approach, using an interposer to connect SoCs and HBM memory using through-silicon vias. The company’s plans for SoIC are more ambitious, packaging both memory on logic along with other elements, such as sensors, in a 3D-IC at the front end of the line. This can significantly reduce assembly time of multiple layers, sizes, and functions. TSMC contends that its bonding scheme enables faster and shorter connections than other 3D-IC approaches. One report said Apple will begin using TSMC’s SoIC technology starting next year, while AMD will expand its use of this approach.

Other innovations
Putting the process and packaging technology in place opens the door to a much broader set of competitive options. Unlike in the past, when big chipmakers, equipment vendors, and EDA companies defined the roadmap for chips, the chiplet world provides the tools for end customers to make those decisions. This is due, in no small part, to the number of features that can be put into a package versus those that can fit inside the reticle limits of an SoC. Packages can be expanded horizontally or vertically, as needed, and in some cases they can improve performance just through vertical floor-planning.

But given the vast opportunity in the cloud and the edge — particularly with the rollout of AI everywhere — the three big foundries, as well as their ecosystems, are racing to develop new capabilities and features. In some cases, this involves leveraging what they already have. In other cases, it requires brand new technologies.

For example, Samsung has started detailing plans about custom HBM, which includes 3D DRAM stacks with a configurable logic layer underneath. This is the second time around for this approach. Back in 2011, Samsung and Micron co-developed the Hybrid Memory Cube, packaging a DRAM stack on a layer of logic. HBM won the war after JEDEC turned it into a standard, and HMC largely disappeared. But there was nothing wrong with the HMC approach, other than perhaps bad timing.

In its new form, Samsung plans to offer customized HBM as an option. Memory is one of the key elements that determine performance, and the ability to read/write and move data back and forth more quickly between memory and processors can have a big impact on performance and power. And those numbers can be significantly better if the memory is right-sized to a specific workload or data type, and if some of the processing can be done inside the memory module so there is less data to move.


Fig. 7: Samsung roadmap and innovations. Source: Semiconductor Engineering/MemCon 2024

Intel, meanwhile, has been working on a better way to deliver power to densely packed transistors, a persistent problem as the transistor density and number of metal layers increases. In the past, power was delivered from the top of the chip down, but two problems have emerged at the most advanced nodes. One is the challenge of actually delivering enough power to every transistor. The second is noise, which can come from power, substrates, or electromagnetic interference. Without proper shielding — something that is becoming more difficult at each new node due to thinner dielectrics and wires — that noise can impact signal integrity.

Delivering power through the backside of a chip minimizes those kinds of issues and reduces wiring congestion. But it also adds other challenges, such as how to drill holes through a thinner substrate without structural damage. Intel apparently has solved these issues, with plans to offer its PowerVia backside power scheme this year.

TSMC said it plans to deliver backside power delivery at A16 in 2026/2027. Samsung is roughly on the same schedule, delivering it in the SF2Z 2nm process.

Intel also has announced plans for glass substrates, which can provide better planarity and lower defectivity than CMOS. This is especially important at advanced nodes, where even nano-sized pits can cause issues. As with backside power delivery, handling issues abound. The upside is that glass has the same coefficient of thermal expansion as silicon, so it is compatible with the expansion and contraction of silicon components, such as chiplets. After years of sitting on the sidelines, glass is suddenly very attractive. In fact, both TSMC and Samsung are working on glass substrates, as well, and the whole industry is starting to learn how to design with glass, handle it without cracking it, and inspect it.

TSMC, meanwhile, has focused heavily on building an ecosystem and expanding its process offerings. Numerous industry sources say TSMC’s real strength is the ability to deliver process development kits for just about any process or package. The foundry produces about 90% of the most advanced chips globally, according to Nikkei. It also has the most experience with advanced packaging of any foundry, and the largest and broadest ecosystem, which is important.

That ecosystem is critical. The chip industry is so complex and varied that no single company can do everything. The question going forward will be how complete those ecosystems truly are, particularly if the number of processes continues to grow. For example, EDA vendors are essential enablers, and for any process or packaging approach to be successful, design teams need automation. But the more processes and packaging options, the more difficult it will be for EDA vendors to support every incremental change or improvement, and potentially the greater the lag time between announcement and delivery.

Conclusion
The recent supply chain glitches and geopolitics have convinced the United States and Europe that they need to re-shore and “friend-shore” manufacturing. The investments in semiconductor fabs, equipment, tools, and research are unprecedented. How that affects the three largest foundries remains to be seen, but it certainly is providing some of the impetus behind new technologies such as co-packaged optics, a raft of new materials, and cryogenic computing.

The impact of all of these changes on market share is becoming harder to track. It’s no longer about which foundry is producing chips at the smallest process node, or even the number of chips being shipped. A single advanced package may have dozens of chiplets. The real key is the ability to deliver solutions that matter to customers, quickly and efficiently. In some cases the driver will be performance per watt, while in others it may be time to results with power as a secondary consideration. And in still others, it may be a combination of features that only one of the leading-edge foundries can provide in sufficient quantity. But what is clear is that the foundry race is significantly more complex than ever before, and becoming more so. In this highly complex world, simple metrics for comparison no longer apply.

References
1. Mini-Consortia Forming Around Chiplets, March 20, 2023; E. Sperling/Semiconductor Engineering
2. TSMC also is the largest shareholder (35%) in Global Unichip Corp., a design services company.








From: FJB 7/22/2024 2:41:11 PM
   of 426
 



From: FJB 2/23/2025 3:38:58 PM
   of 426
 
Microsoft’s New Quantum Computer, Summed Up In 3 Words
Trevor Filseth

nationalinterest.org

By creating a new state of matter, Microsoft’s engineers have ensured their quantum computer is truly one-of-a-kind.

Everyone is fixated on the race for artificial intelligence dominance. Few, however, are taking the quest for quantum supremacy seriously. They should not lose sight of this—especially because Microsoft has just made what they claim to be a significant breakthrough in the mission to be the leader of the quantum computing revolution.

Microsoft’s “Majorana 1” Quantum Chip

It’s called “Majorana 1,” and Microsoft says it is the world’s first Quantum Processing Unit (QPU) powered by what’s known as a topological core, which is designed to scale to a million qubits on a single chip. And the chip in question—what Microsoft calls a “topoconductor,” short for topological conductor—can fit into the palm of your hand.

The “Majorana 1” device gets its name from “Majorana Zero Modes” (MZMs). An MZM is a unique quantum particle that exists at the edges of certain materials (such as what the Majorana 1 is made of). They must exist in a state of absolute zero to operate. What’s more, they allow for rapid processing of highly complex problems at a very low error rate.

That’s the key aspect of quantum computing that people don’t seem to understand. It’s the speed of the processing. Quantum computers can solve highly complex problems very quickly. Specifically, it can solve the kinds of complex problems that traditional supercomputers either cannot solve or would take far too long to resolve.

Understanding the Physics

Quantum computers work on an entirely different level of physics than do conventional computers.

Classical computers operate on binary bits. The quantum computer, however, works according to “quantum bits” (or “qubits”). Whereas binary bits can either be one or zero, a qubit can exist in multiple states. In other words, that zero and one can exist either separately, as one or zero, or—and here’s where things get weird for most people—the zero or one can exist simultaneously.
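A toy numerical sketch of that "both at once" idea: a single qubit is a two-component state vector, and measuring it returns 0 or 1 with probabilities given by the squared amplitudes. This illustrates the standard textbook model, not Microsoft's hardware.

```python
# Toy sketch of a single qubit as a two-component state vector.
# A classical bit is 0 or 1; the qubit state a|0> + b|1> carries both
# amplitudes, and measurement returns 0 with probability |a|^2, 1 with |b|^2.
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# Equal superposition, produced by a Hadamard rotation of |0>.
psi = (ket0 + ket1) / np.sqrt(2)

probabilities = np.abs(psi) ** 2          # [0.5, 0.5]
rng = np.random.default_rng(0)
samples = rng.choice([0, 1], size=10_000, p=probabilities)

print("P(0), P(1):", probabilities)
print("Measured frequencies:", np.bincount(samples) / samples.size)
```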

This is where you start to see people on The Joe Rogan Experience talking about quantum computing being the gateway for peering into multiple universes. That’s because, in order to quickly resolve a complex problem set a quantum computer is presented with, the quantum computer essentially looks at all possibilities and then seeks a resolution based on the best probable result.

And this is where things get dicey for the scientists working on quantum computers. Ordinarily, quantum computers have a high error rate.

The Microsoft team that has developed Majorana 1 says they’ve created a device that has reliable “quantum error correction” (QEC). Essentially, Microsoft claims that they’ve created a fault-tolerant quantum computer. Being fault tolerant is key.

Majorana-1 can continue functioning accurately even though occasional errors in qubits and gates will arise. One of the ways that Microsoft has ensured their new system has a relatively low QEC (around one percent error rate, which the engineers think they can reduce more over time), is via digital control that allows for computer engineers “to manage the large numbers of qubits needed for real-world applications.”
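The roughly one percent error rate quoted above is why fault tolerance is the central claim: without correction, errors compound quickly over a long computation. A back-of-envelope sketch, assuming independent errors purely for illustration:

```python
# Back-of-envelope sketch: with a ~1% error rate per operation and no
# correction, the chance that a computation of n operations finishes with
# no error at all is (1 - p)**n. Independence is a simplifying assumption.
p = 0.01
for n in (10, 100, 1_000, 10_000):
    print(f"{n:6d} operations -> {(1 - p) ** n:.4%} chance of an error-free run")
# e.g. 0.99**100 is roughly 36.6%, which is why error correction is essential.
```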

Low Error Rates

Microsoft claims to have achieved topoconductor superconductivity—meaning they’ve created an entirely new class of material, which is what separates the Majorana-1 from other quantum computers today. This new material is what allows Microsoft to have a digitally controlled, small, and very fast qubit running its quantum computer.

By creating a new state of matter, Microsoft’s engineers have ensured their quantum computer is truly one-of-a-kind. That, in turn, likely means that Microsoft (at least for now) has quantum supremacy over its rivals.

Microsoft’s Quantum Chip: A Technological Breakthrough

Quantum computing is suddenly all over the news. That’s likely because the tech sector is increasingly consumed with the objective of developing artificial intelligence. For AI to work well, it needs massive amounts of energy, data, and processing power. Quantum computers will give AI the processing power it needs to be truly dominant, if the engineers can make quantum computing viable and scalable. It appears that Microsoft has taken the first step towards doing that.

Of course, there are detractors. Some experts argue that it’s all hype—not just what Microsoft is saying but what many of these quantum computing firms are claiming to have achieved in their research and development.

DARPA’s Role

The fact that the Microsoft program was “part of the final phase of the Defense Advanced Research Projects Agency (DARPA) Underexplored Systems for Utility-Scale Quantum Computer (US2QC) program,” as reported by Microsoft itself, means that this is not just a baseless claim meant to generate buzz for the tech company.

After all, DARPA is the group that has had its hand in some of the country’s most significant scientific and technological breakthroughs (most notably the Internet).

Lastly, the role of DARPA should not be overlooked because of the obvious national security implications (and complications) that quantum computing poses. Notably, modern encryption techniques can be easily hacked by quantum computers—as China’s quantum computer alarmingly demonstrated recently. Further, if paired with AI, an effective and scalable quantum computer could be lethal on the future battlefield.

Don’t forget, too, that, as cryptocurrencies are taking off, some security analysts fear that quantum computers could hack the blockchain technology that undergirds cryptocurrencies.

All these developments point to not only a revolution of AI, but a concomitant revolution in quantum computing—something that Microsoft itself has said they are spearheading.

So, even amid the other revolutionary advances in technology that have come to the fore this decade, keep your eye out for quantum computing. It’s coming sooner than most want to admit.

Brandon J. Weichert, a Senior National Security Editor at The National Interest as well as a Senior Fellow at the Center for the National Interest, and a contributor at Popular Mechanics, consults regularly with various government institutions and private organizations on geopolitical issues. Weichert’s writings have appeared in multiple publications, including the Washington Times, National Review, The American Spectator, MSN, the Asia Times, and countless others. His books include Winning Space: How America Remains a Superpower, Biohacked: China’s Race to Control Life, and The Shadow War: Iran’s Quest for Supremacy. His newest book, A Disaster of Our Own Making: How the West Lost Ukraine is available for purchase wherever books are sold. He can be followed via Twitter @WeTheBrandon.

Image: Shutterstock.




