
Technology Stocks: New Technology


From: FJB, 1/19/2022 6:47:06 PM
 
Security Engineering Lecture 1: Who is the Opponent?




From: FJB, 3/21/2022 8:44:32 PM
 
Lanai, the mystery CPU architecture in LLVM.


Disclaimer: I have had access to some confidential information about some of the matters discussed in this page. However, everything written here is derived from publicly available sources, and references to these sources are also provided.


https://q3k.org/lanai.html

Some of my recent long-term projects revolve around a little known CPU architecture called 'Lanai'. Unsurprisingly, very few people have heard of it, and even their Googling skills don't come in handy. This page is a short summary of what I know, and should serve as a reference for future questions.

Myricom & the origins of Lanai

Myricom is a hardware company founded in 1994. One of their early products was a networking interface card family and protocol, Myrinet. I don't know much about it, other than it did some funky stuff with wormhole routing.

As part of their network interface card design, they introduced data plane programmability with the help of a small RISC core they named LANai. It originally ran at 33MHz, the speed of the PCI bus on which the cards were operating. These cores were quite well documented on the Myricom website, seemingly with the end-user programmability being a selling point of their devices.

It's worth noting that multiple versions of LANai/Lanai have been released. The last publicly documented version on the old Myricom website is Lanai3/4. Apart from the documentation, sources for a gcc/binutils fork exist to this day on Myricom's Github.

At some point, however, Myricom stopped publicly documenting the programmability of their network cards, but documentation/SDK was still available on request. Some papers and research websites actually contain tutorials on how to get running with the newest versions of the SDK at the time, and even document the differences between the last documented Lanai3/4 version and newer releases of the architecture/core.

This closing down of the Lanai core documentation by Myricom didn't mean they stopped using it in their subsequent cards. The core made its way into their Ethernet offerings (after Myrinet basically died), like their 10GbE network cards. You can easily find these 10G cards on eBay, and they even have the word 'Lanai' written on their main ASIC package. Even more interestingly, Lanai binaries are shipped with Linux firmware packages, and can be chucked straight into a Lanai disassembler (eg. the Myricom binutils fork's objdump).

Technical summary of Lanai3/4
  • 32 registers, most of them general purpose, with special treatment for R0 (all zeroes), R1 (all ones), R2 (the program counter), R3 (status register), and some registers allocated for mode/context switching.
  • 4-stage RISC-style pipeline: Calculate Address, Fetch, Compute, Memory
  • Delay slot based pipeline hazard resolution
  • No multiplication, no division. It's meant to route packets, not crunch numbers.
  • The world's best instruction mnemonic: PUNT, to switch between user and system contexts.
Here's a sample of Lanai assembly:

000000f8 <main>:
      f8: 92 93 ff fc   st      %fp, [--%sp]
      fc: 02 90 00 08   add     %sp, 0x8, %fp
     100: 22 10 00 08   sub     %sp, 0x8, %sp
     104: 51 80 00 00   or      %r0, 0x0, %r3
     108: 04 81 40 01   mov     0x40010000, %r9
     10c: 54 a4 08 0c   or      %r9, 0x80c, %r9
     110: 06 01 11 11   mov     0x11110000, %r12
     114: 56 30 11 11   or      %r12, 0x1111, %r12
     118: 96 26 ff f4   st      %r12, -12[%r9]
     11c: 96 26 ff f8   st      %r12, -8[%r9]
     120: 86 26 13 f8   ld      5112[%r9], %r12

00000124 <.LBB3_1>:
     124: 46 8d 00 00   and     %r3, 0xffff, %r13
     128: 96 a4 00 00   st      %r13, 0[%r9]
     12c: 01 8c 00 01   add     %r3, 0x1, %r3
     130: e0 00 01 24   bt      0x124 <.LBB3_1>
     134: 96 24 00 00   st      %r12, 0[%r9]
The `add`/`sub`/`or` instructions have their destination on the right-hand side. `st` and `ld` are memory store and load instructions, respectively. Note the lack of a 32-bit immediate load: instead, a `mov` and an `or` are used in tandem (here `mov 0x40010000, %r9` followed by `or %r9, 0x80c, %r9` builds the constant 0x4001080c). That `mov` instruction isn't real, either - it's a pseudo instruction for an `add 0, 0x40010000, %r9`. Also note the branch delay slot at address 134 (this instruction gets executed even if the branch at 130 is taken).

The ISA is quite boring, and in my opinion that's a good thing. It makes core implementations easy and fast, and it generally feels like one of the RISC-iest cores I've dealt with. The only truly interesting thing about it is its dual-context execution system, but that unfortunately becomes irrelevant at some point, as we'll see later.

Google & the Lanai team

In the early 2010s, things weren't going great at Myricom. Due to financial and leadership difficulties, some of their products got canceled, and in 2013 core Myricom engineers were bought out by Google, taking the Lanai intellectual property rights with them. The company still limps on, seemingly targeting the network security and fintech markets, and even continuing to market their networking gear as programmable, but Lanai is nowhere to be seen in their new designs.

So what has Google done with the Lanai engineers and technology? The only thing we know is that in 2016 Google implemented and upstreamed a Lanai target in LLVM, and that it was to be used internally at Google. What is it used for? Only Google knows, and Google isn't saying.

The LLVM backend targets Lanai11. This is quite a few numbers higher than the last publicly documented Lanai3/4, and there are quite a few differences between them:

  1. No more dual-context operation, no more PUNT instruction. The compiler/programmer can now make use of nearly all registers from r4 to r31.
  2. No more dual-ALU (R-R-R) instructions. This was obviously slow, and was probably a combinatorial bottleneck in newer microarchitectural implementations.
  3. Slightly different delay slot semantics, pointing at a new microarchitecture (likely having stepped away from a classic RISC pipeline into something more modern).
  4. New additional instruction format and set of accompanying instructions: SPLS (special part-word load/store), SLI (special load immediate), and Special Instruction (containing amongst others popcount, of course).
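
Since the backend ships with stock LLVM, it's easy to poke at the emitted code yourself. Here's a minimal sketch, assuming your llc build enables the Lanai target (the file name and the trivial IR function are just placeholders):

# Emit Lanai assembly from LLVM IR with the in-tree backend.
# Assumes llc was built with the Lanai target enabled.
cat > add.ll <<'EOF'
define i32 @add(i32 %a, i32 %b) {
  %sum = add i32 %a, %b
  ret i32 %sum
}
EOF
llc -march=lanai add.ll -o add.s   # writes Lanai11 assembly to add.s
cat add.s

This is a handy way to look at Lanai11 instruction selection and its calling convention without any Myricom tooling.
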
Lanai Necromancy

As you can tell by this page, this architecture intrigued me. The fact that it's an LLVM target shipped with nearly every LLVM distribution while no-one has access to hardware which runs the emitted code is just so spicy. Apart from writing this page, I have a few other Lanai-related projects, and I'd like to introduce them here:

  1. I'm porting Rust to Lanai11. I have a working prototype, which required submitting some patches to upstream LLVM to deal with IR emitted by rustc. This has been upstreamed. My rustc patches are pending on...
  2. I'm implementing LLD support for Lanai. Google (in the LLVM mailing list posts) mentions they use a binutils ld, forked off from the Myricom binutils fork. I've instead opted to implement an LLD backend for Lanai, which currently only supports the simplest relocations. I haven't yet submitted a public LLVM change request for this, but this is on my shortlist of things to do. I have to first talk to the LLVM/Google folks on the maintenance plan for this.
  3. I've implemented a simple Lanai11 core in Bluespec, as part of my qfc monorepo. 3-stage pipeline (merged addr/fetch stages), in-order. It's my first bit of serious Bluespec code, so it's not very good. I plan on implementing a better core at some point.
  4. I've implemented a small Lanai-based microcontroller, qf105, which is due to be manufactured in 130nm as part of the OpenMPW5 shuttle. Which is, notably, sponsored by Google :).
If you're interested in following or joining these efforts, hop on to ##q3k on libera.chat.

In addition to my effort piecing together information about Lanai and making use of it for my own needs, the TrueBit project also used it as a base for their smart contract system (in which they implemented a Lanai interpreter in Solidity).

Documentation

Useful resources, in no particular order:




From: FJB, 3/25/2022 5:18:32 PM
 
Writing a Simple Operating System — from Scratch

cs.bham.ac.uk



From: retrodynamic, 4/17/2022 9:58:48 PM
 
State of the Art Novel InFlow Tech: ·1-Gearturbine Reaction Turbine Rotary Turbo, ·2-Imploturbocompressor Impulse Turbine 1 Compression Step.



·1-Gearturbine: Reaction Turbine, ·Rotary-Turbo, Similar System of the Aeolipilie ·Heron Steam Device from 10-70 AD, ·With Retrodynamic = DextroGiro/RPM VS LevoGiro/InFlow, + ·Ying Yang Circular Power Type, ·Non Waste Parasitic Power Looses Type, ·8-X,Y Thermodynamic Cycle Way Steps, Patent: #197187 / IMPI - MX.



·2-Imploturbocompressor: Impulse Turbine, ·Implo-Ducted, One Moving Part System Excellence Design, · InFlow Goes from Macro-Flow to Micro-Flow by Implosion/And Inverse, ·One Compression Step, ·Circular Dynamic Motion. Implosion Way Type, ·Same Nature of a Hurricane Satellite View.

stateoftheartnovelinflowtech.blogspot.com

https://padlet.com/gearturbine/un2slbar3s94

https://www.behance.net/gearturbina61a




From: FJB, 4/20/2022 7:03:09 AM
 


hpcwire.com

Nvidia R&D Chief on How AI is Improving Chip Design
By John Russell






Getting a glimpse into Nvidia’s R&D has become a regular feature of the spring GTC conference with Bill Dally, chief scientist and senior vice president of research, providing an overview of Nvidia’s R&D organization and a few details on current priorities. This year, Dally focused mostly on AI tools that Nvidia is both developing and using in-house to improve its own products – a neat reverse sales pitch if you will. Nvidia has, for example, begun using AI to effectively improve and speed up GPU design.

Bill Dally of Nvidia in his home ‘workshop’

“We’re a group of about 300 people that tries to look ahead of where we are with products at Nvidia,” described Dally in his talk this year. “We’re sort of the high beams trying to illuminate things in the far distance. We’re loosely organized into two halves. The supply half delivers technology that supplies GPUs. It makes GPUs themselves better, ranging from circuits, to VLSI design methodologies, architecture networks, programming systems, and storage systems that go into GPUs and GPU systems.”

“The demand side of Nvidia research tries to drive demand for Nvidia products by developing software systems and techniques that need GPUs to run well. We have three different graphics research groups, because we’re constantly pushing the state of the art in computer graphics. We have five different AI groups, because using GPUs to run AI is currently a huge thing and getting bigger. We also have groups doing robotics and autonomous vehicles. And we have a number of geographically oriented labs like our Toronto and Tel Aviv AI labs,” he said.

Occasionally, Nvidia launches a Moonshot effort pulling from several groups – one of these, for example, produced Nvidia’s real-time ray tracing technology.

As always, there was overlap with Dally’s prior-year talk – but there was also new information. The size of the group has certainly grown from around 175 in 2019. Not surprisingly, efforts supporting autonomous driving systems and robotics have intensified. Roughly a year ago, Nvidia recruited Marco Pavone from Stanford University to lead its new autonomous vehicle research group, said Dally. He didn’t say much about CPU design efforts, which are no doubt also intensifying.



Presented here are small portions of Dally’s comments (lightly edited) on Nvidia’s growing use of AI in designing chips, along with a few supporting slides.

1 Mapping Voltage Drop

“It’s natural as an expert in AI that we would want to take that AI and use it to design better chips. We do this in a couple of different ways. The first and most obvious way is we can take existing computer-aided design tools that we have [and incorporate AI]. For example, we have one that takes a map of where power is used in our GPUs, and predicts how far the voltage grid drops – what’s called IR drop for current times resistance drop. Running this on a conventional CAD tool takes three hours,” noted Dally.

“Because it’s an iterative process, that becomes very problematic for us. What we’d like to do instead is train an AI model to take the same data; we do this over a bunch of designs, and then we can basically feed in the power map. The [resulting] inference time is just three seconds. Of course, it’s 18 minutes if you include the time for feature extraction. And we can get very quick results. A similar thing in this case, rather than using a convolutional neural network, we use a graph neural network, and we do this to estimate how often different nodes in the circuit switch, and this actually drives the power input to the previous example. And again, we’re able to get very accurate power estimations much more quickly than with conventional tools and in a tiny fraction of the time,” said Dally.





2 Predicting Parasitics

“One that I particularly like – having spent a fair amount of time a number of years ago as a circuit designer – is predicting parasitics with graph neural networks. It used to be that circuit design was a very iterative process where you would draw a schematic, much like this picture on the left here with the two transistors. But you wouldn’t know how it would perform until after a layout designer took that schematic and did the layout, extracted the parasitics, and only then could you run the circuit simulations and find out you’re not meeting some specifications,” noted Dally.

“You’d go back and modify your schematic [and go through] the layout designer again, a very long and iterative and inhuman labor-intensive process. Now what we can do is train neural networks to predict what the parasitics are going to be without having to do layout. So, the circuit designer can iterate very quickly without having that manual step of the layout in the loop. And the plot here shows we get very accurate predictions of these parasitics compared to the ground truth.”



3 Place and Routing Challenges

“We can also predict routing congestion; this is critical in the layout of our chips. The normal process is we would have to take a net list, run through the place and route process, which can be quite time consuming often taking days. And only then we would get the actual congestion, finding out that our initial placement is not adequate. We need to refactor it and place the macros differently to avoid these red areas (slide below), which is where there’s too many wires trying to go through a given area, sort of a traffic jam for bits. What we can do instead now is without having to run the place and route, we can take these net lists and using a graph neural network basically predict where the congestion is going to be and get fairly accurate. It’s not perfect, but it shows the areas where there are concerns, we can then act on that and do these iterations very quickly without the need to do a full place and route,” he said.



4 Automating Standard Cell Migration

“Now those [approaches] are all sort of using AI to critique a design that’s been done by humans. What’s even more exciting is using AI to actually do the design. I’ll give you two examples of that. The first is a system we have called NVCell, which uses a combination of simulated annealing and reinforcement learning to basically design our standard cell library. So each time we get a new technology, say we’re moving from a seven nanometer technology to a five nanometer technology, we have a library of cells. A cell is something like an AND gate, an OR gate, a full adder. We’ve got actually many thousands of these cells that have to be redesigned in the new technology with a very complex set of design rules,” said Dally.

“We basically do this using reinforcement learning to place the transistors. But then more importantly, after they’re placed, there are usually a bunch of design rule errors, and it goes through almost like a video game. In fact, this is what reinforcement learning is good at. One of the great examples is using reinforcement learning for Atari video games. So this is like an Atari video game, but it’s a video game for fixing design rule errors in a standard cell. By going through and fixing these design rule errors with reinforcement learning, we’re able to basically complete the design of our standard cells. What you see (slide) is that 92 percent of the cell library was able to be done by this tool with no design rule or electrical rule errors. And 12 percent of them are smaller than the human-designed cells, and in general, over the cell complexity, [this tool] does as well or better than the human-designed cells,” he said.

“This does two things for us. One is that it’s a huge labor savings: a group on the order of 10 people will take the better part of a year to port a new technology library. Now we can do it with a couple of GPUs running for a few days. Then the humans can work on those 8 percent of the cells that didn’t get done automatically. And in many cases, we wind up with a better design as well. So it’s labor savings and better than human design.”





There was a good deal more to Dally’s talk, all of it a kind of high-speed dash through a variety of Nvidia’s R&D efforts. If you’re interested, here is HPCwire’s coverage of two previous Dally R&D talks – 2019, 2021 – for a rear-view mirror into work that may begin appearing in products. As a rule, Nvidia’s R&D is very product-focused rather than basic science. You’ll note his description of the R&D mission and organization hasn’t changed much but the topics are different.




From: FJB, 6/7/2022 9:29:44 PM
 
Miracle Drug Shows 100% Remission For All Cancer Patients In Drug Trial



From: FJB, 6/13/2022 11:54:24 PM
 
so awesome

Diving into GCC internals



From: FJB, 9/27/2022 7:06:20 AM
 
Weave Ignite

github.com


Weave Ignite is an open source Virtual Machine (VM) manager with a container UX and built-in GitOps management.

  • Combines Firecracker MicroVMs with Docker / OCI images to unify containers and VMs.
  • Works in a GitOps fashion and can manage VMs declaratively and automatically like Kubernetes and Terraform.
  • Ignite is fast and secure because of Firecracker. Firecracker is an open source KVM implementation from AWS that is optimised for high security, isolation, speed and low resource consumption. AWS uses it as the foundation for their serverless offerings (AWS Lambda and Fargate) that need to load nearly instantly while also keeping users isolated (multitenancy). Firecracker has proven to be able to run 4000 micro-VMs on the same host!

    What is Ignite? Read the announcement blog post here: weave.works

    Ignite makes Firecracker easy to use by adopting its developer experience from containers. With Ignite, you pick an OCI-compliant image (Docker image) that you want to run as a VM, and then just execute ignite run instead of docker run. There’s no need to use VM-specific tools to build .vdi, .vmdk, or .qcow2 images, just do a docker build from any base image you want (e.g. ubuntu:18.04 from Docker Hub), and add your preferred contents.

    When you run your OCI image using ignite run, Firecracker will boot a new VM in about 125 milliseconds (!) for you using a default 4.19 Linux kernel. If you want to use some other kernel, just specify the --kernel-image flag, pointing to another OCI image containing a kernel at /boot/vmlinux, and optionally your preferred modules. Next, the kernel executes /sbin/init in the VM, and it all starts up. After this, Ignite connects the VMs to any CNI network, integrating with e.g. Weave Net.
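
    A rough sketch tying those two ideas together (the image names, kernel tag, and package below are illustrative rather than copied from the Ignite docs; whether a locally built Docker image is directly visible to Ignite depends on which container runtime Ignite is configured to use):

    # Build a custom VM base image from a Dockerfile, then boot it with an explicit kernel image.
    # weaveworks/ignite-ubuntu and weaveworks/ignite-kernel are the images Ignite itself ships;
    # the :4.19.125 tag is an assumption - check the Ignite docs for the tags matching your release.
    cat > Dockerfile <<'EOF'
    FROM weaveworks/ignite-ubuntu:latest
    RUN apt-get update && apt-get install -y htop
    EOF
    docker build -t my-ignite-base .

    ignite run my-ignite-base \
        --kernel-image weaveworks/ignite-kernel:4.19.125 \
        --cpus 1 \
        --memory 512MB \
        --ssh \
        --name custom-kernel-vm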

    Ignite is a declarative Firecracker microVM administration tool, similar to how Docker manages runC containers. Ignite runs VMs from OCI images, spins VMs up/down at lightning speed, and can manage fleets of VMs efficiently using GitOps.

    The idea is that Ignite makes Firecracker VMs look like Docker containers. Now we can deploy and manage full-blown VM systems just like e.g. Kubernetes workloads. The images used are OCI/Docker images, but instead of running them as containers, it executes their contents as a real VM with a dedicated kernel and /sbin/init as PID 1.

    Networking is set up automatically, the VM gets the same IP as any container on the host would.

    And Firecracker is fast! Building and starting VMs takes just some fraction of a second, or at most some seconds. With Ignite you can get started with Firecracker in no time!

    Use-cases

    With Ignite, Firecracker is now much more accessible for end users, which means the ecosystem can achieve a next level of momentum due to the easy onboarding path thanks to the docker-like UX.

    Although Firecracker was designed with serverless workloads in mind, it can equally well boot a normal Linux OS, like Ubuntu, Debian or CentOS, running an init system like systemd.

    Having a super-fast way of spinning up a new VM, with a kernel of choice, running an init system like systemd allows running system-level applications like the kubelet, which need to “own” the full system.

    Example use-cases:

  • Set up many secure VMs lightning fast. It's great for testing, CI and ephemeral workloads.
  • Launch and manage entire “app ready” stacks from Git because Ignite supports GitOps!
  • Run even legacy or special apps in lightweight VMs (eg for multi-tenancy, or using weird/edge kernels).
  • And - potentially - we can run a cloud of VMs ‘anywhere’ using Kubernetes for orchestration, Ignite for virtualization, GitOps for management, and supporting cloud native tools and APIs.

    Scope

    Ignite is different from Kata Containers or gVisor. They don’t let you run real VMs, but only wrap a container in a VM layer providing some kind of security boundary (or sandbox).

    Ignite on the other hand lets you run a full-blown VM, easily and super-fast, but with the familiar container UX. This means you can “move down one layer” and start managing your fleet of VMs powering e.g. a Kubernetes cluster, but still package your VMs like containers.

    Installing

    Please check out the Releases Page.

    How to install Ignite is covered in docs/installation.md or on Read the Docs.

    Guidance on Cloud Providers' instances that can run Ignite is covered in docs/cloudprovider.md.

    Getting Started

    WARNING: In its v0.X series, Ignite is in alpha, which means that it might change in backwards-incompatible ways.



    Note: At the moment ignite and ignited need root privileges on the host to operate due to certain operations (e.g. mount). This will change in the future.


    # Let's run the weaveworks/ignite-ubuntu OCI image as a VM
    # Use 2 vCPUs and 1GB of RAM, enable automatic SSH access and name it my-vm
    ignite run weaveworks/ignite-ubuntu \
        --cpus 2 \
        --memory 1GB \
        --ssh \
        --name my-vm

    # List running VMs
    ignite ps

    # List Docker (OCI) and kernel images imported into Ignite
    ignite images
    ignite kernels

    # Get the boot logs of the VM
    ignite logs my-vm

    # SSH into the VM
    ignite ssh my-vm

    # Inside the VM you can check that the kernel version is different, and the IP address came from the container
    # Also the memory is limited to what you specify, as well as the vCPUs
    > uname -a
    > ip addr
    > free -m
    > cat /proc/cpuinfo

    # Rebooting the VM tells Firecracker to shut it down
    > reboot

    # Cleanup
    ignite rm my-vm

    For a walkthrough of how to use Ignite, go to docs/usage.md.

    Getting Started the GitOps way

    Ignite is a "GitOps-first" project; GitOps is supported out of the box using the ignited gitops command. Previously this was integrated as ignite gitops, but this functionality has now moved to ignited, Ignite's upcoming daemon binary.

    In Git you declaratively store the desired state of a set of VMs you want to manage. ignited gitops reconciles the state from Git, and applies the desired changes as state is updated in the repo. It also commits and pushes any local changes/additions to the managed VMs back to the repository.

    This can then be automated, tracked for correctness, and managed at scale - just some of the benefits of GitOps.

    The workflow is simply this:

  • Run ignited gitops [repo], where repo is an SSH url to your Git repo
  • Create a file with the VM specification, specifying how many vCPUs and how much RAM, disk, etc. you’d like for the VM (see the sketch after this list)
  • Run git push and see your VM start on the host
  • See it in action! (Note: The screencast is from an older version which differs somewhat)
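
    A minimal sketch of the flow above, assuming a hypothetical Git repo (the apiVersion string and field names follow the Ignite VM API as documented, but treat them as assumptions and check docs/api for the exact values your Ignite version expects):

    # Hypothetical repo URL; any SSH-accessible Git repo works.
    git clone git@github.com:example/my-vms.git && cd my-vms

    # Declare the desired VM as a file in the repo.
    cat > my-vm.yaml <<'EOF'
    apiVersion: ignite.weave.works/v1alpha2
    kind: VM
    metadata:
      name: my-vm
    spec:
      image:
        oci: weaveworks/ignite-ubuntu
      cpus: 2
      memory: 1GB
      diskSize: 3GB
      ssh: true
    EOF

    git add my-vm.yaml
    git commit -m "Add my-vm"
    git push

    # On the host, ignited reconciles the repo into running VMs:
    ignited gitops git@github.com:example/my-vms.git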




From: FJB, 9/28/2022 6:02:18 PM
     
    NEXT WAVE IS MOVING AWAY FROM CLOUD. IT IS KIND OF A RIP OFF...

    levelup.gitconnected.com

    How we reduced our annual server costs by 80% — from $1M to $200k — by moving away from AWS
    Trey Huffine

    An interview with Zsolt Varga, the tech lead and general manager at Prerender



    This week we interviewed Zsolt Varga, the lead engineer and manager at Prerender.io. He shares how Prerender saved $800k by removing their reliance on AWS and building in-house infrastructure to handle traffic and cached data.

    “The goal was to reduce costs while maintaining the same speed of rendering and quality of service. Migrations like this need to be carefully planned and executed, as incorrect configuration or poor execution would cause downtime for customer web pages and social media clicks and make their search rankings suffer and potentially increase our churn rate.”


    Can you describe Prerender and the most interesting technical problem you’re solving?

    Prerender, in simple terms, caches and prerenders your JavaScript pages so search engines can have a pure HTML file to crawl and index, and all it needs is to have the proper middleware installed on the site, saving users the pain of costly and long JavaScript workarounds.

    However, all this data and processes need to happen on a server and, of course, we used AWS for it. A few years of growth later, we’re handling over 70,000 pages per minute, storing around 560 million pages, and paying well over $1,000,000 per year.
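
    (For scale, a $1,000,000-a-year bill works out to roughly $83K per month, or about $2.7K per day, numbers worth keeping in mind for the daily cost figures quoted later in the migration.)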

    Or at least we would be paying that much if we stayed with AWS. Instead, we were able to cut costs by 80% in a little over three months with some out-of-the-box thinking and a clear plan. Here’s how you could too.

    Planning a Migration: Our Step-by-Step Guide

    Up until recently, Prerender stored the pages it caches and renders for its clients using servers and services hosted on Amazon Web Services (AWS), one of the largest cloud providers, offering virtual servers and managed services.

    Prerender had hitherto used AWS to store the pages it cached until they were ready to be picked up by Google, Facebook, or any other bot/spider looking for content to be indexed. This provided much of Prerender’s functionality — delivering static HTML to Google and other search engines, and dynamic, interactive JavaScript to human users.

    The problem? Storing multiple terabytes of prerendered web page contents in this way on a 3rd party server is hugely expensive. Storing the cached pages in this way was costing Prerender astronomical amounts of money in maintenance and hosting fees alone.

    But there was another catch that not many start-ups take into account and there’s not too much of a conversation around it: traffic cost.

    Getting data into AWS is technically free, but what good is static data for most software? Moving the data around became a huge cost for Prerender, and we started to notice the bottleneck that was holding us back.

    The solution? Migrate the cached pages and traffic onto Prerender’s own internal servers and cut our reliance on AWS as quickly as possible.

    When we did a cost projection we estimated that we could reduce our hosting fees by 40%, and decided a server migration would save money for both Prerender and our clients.

    The goal was to reduce costs while maintaining the same speed of rendering and quality of service. Migrations like this need to be carefully planned and executed, as incorrect configuration or poor execution would cause downtime for customer web pages and social media clicks and make their search rankings suffer and potentially increase our churn rate.

    To mitigate the potential consequences, we planned a three-phase process by which we could easily revert back to the previous step if anything went wrong. If for whatever reason the new servers didn’t work, we could easily roll back our changes without any downtime or service degradation noticeable to customers.

    The caveat with continual and systematic testing is that it takes place over weeks and months.

    Moving Prerender Away From AWS: a Weekly Overview

    Phase 1 — Testing (4 to 6 Weeks)

    Phase 1 mostly involved setting up the bare metal servers and testing the migration in a smaller, more manageable setting before scaling. This phase required minimal software adaptation, which we decided to run on KVM virtualization on Linux.

    In early May, the first batch of servers was running, and 1% of Prerender traffic was directed to the new servers. Two weeks into the migration, we were already saving $800 a day. By the end of the month, we’d migrated most of the traffic workloads away from AWS, reducing the daily Chrome rendering workload costs by 45%.

    On the server side, our cost was at $13K per month. Combined with AWS, we had already cut our expenses by 22%.



    The testing phase was crucial to make sure the following processes would run smoothly. We worked on improving the system robustness with more monitoring & better error handling. Besides the server monitoring dashboard we already had, we also set up a new rendering monitoring dashboard to be able to spot any error or performance issue that occurred.



    Thanks to our constant monitoring and clear communication, tests were successful, our savings projections were exceeded and everything was in place to start phase 2 of the migration.

    Phase 2 — Technical Set-Up (4 Weeks)

    The migration period between June and early July was mostly technical set-up after the first phase of the migration served as a proof of concept. Implementation of the second phase mostly involved moving the cache storage to the bare metal servers.

    When the migration reached mid-June, we had 300 servers running very smoothly with a total of 200 million cached pages. We used Apache Cassandra nodes on each of the servers that were compatible with AWS S3.

    We broke the online migration into four steps, each a week or two apart. After testing whether Prerender pages could be cached in both S3 and minio, we slowly diverted traffic away from AWS S3 and towards minio. Once the writes to S3 had been stopped completely, Prerender was saving $200 a day on S3 API costs, and this signaled we were ready to start deleting data already cached in our Cassandra cluster.

    However, the big reveal came at the end of this phase around June 24th. In the last four weeks, we moved most of the cache workload from AWS S3 to our own Cassandra cluster. The AWS cost was reduced to $1.1K per day, projecting to about $35K per month, and the new servers’ monthly recurring cost was estimated to be around $14K.
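
    (Sanity check on those figures: $1.1K a day of AWS spend is roughly $33K a month; adding the ~$14K of new server costs gives a total of about $47K-$49K, against the original bill of over $83K a month, which lines up with the 41.2% reduction reported at the end of this phase.)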

    At this point, there were still some leftovers on S3 which cost around $60 per day and would completely die out naturally in a few weeks. Although we could have moved all the data out to cut it to zero immediately, it would have left us a one-time “money waste” of $5K to move data out of AWS.

    Moving data around is where you’ll start running into huge bottlenecks. In the words of our new CTO (Zsolt Varga):

    “The true hidden price for AWS is coming from the traffic cost. They sell reasonably priced storage, and it’s even free to upload it. But when you get it out, you pay an enormous cost.

    Small startups often don’t calculate the traffic cost, even though it can be 90% of their bill.”

    For example, if you are in the US West(Oregon) region, you have to shell out $0.080/GB whereas in the Asia Pacific (Seoul) region it bumps up to $0.135/GB.

    In our case, it was easily around $30k to $50k per month. By the end of phase two, we had reduced our total monthly server costs by 41.2%.
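
    (To put that traffic bill in perspective: at the US West rate of $0.080/GB, $30k a month of egress corresponds to roughly 30,000 / 0.080 = 375,000 GB, i.e. about 375 TB leaving AWS each month, and $50k corresponds to roughly 625 TB.)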



    Phase 3 — Implementation and Scaling (4 to 6 Weeks)

    At this stage, the migration was well underway and was already saving Prerender a considerable amount of money. The only thing left to do was migrate all the other data onto the native servers.

    This step involved moving all the Amazon RDS instances shard by shard. This was the most error-prone part of the whole process, but since a fair amount of the data had already been migrated, any hiccups or bottlenecks wouldn’t have brought the whole migration crashing down.

    Here’s a big picture view of this last stage in the migration process:

      • We mirrored PostgreSQL shards storing cached_urls tables in Cassandra
      • We switched service.prerender.io to Cloudflare load balancer to allow dynamic traffic distribution
      • We set up new EU private-recache servers
      • We keep performing stress tests to solve any performance issues
    The migration proved to be a resounding success in the end. Our monthly server fees dropped not just by our initial estimate of 40% but by a full 80% by the time all the cached pages were redirected.

    What We Learned

    There is a lot at stake in a server migration if things go wrong or fall behind schedule. That’s why we implemented fail safes at each stage of the migration to make sure we could fall back if something were to go wrong. It’s also why we tested on a small scale before proceeding with the rest of the migration.

    We avoided the dangers by carefully planning each stage of the migration, testing each stage of implementation before scaling, and making it easy to correct any errors should anything go wrong. That way, we could reap the benefits of saving on server fees while keeping any potential risks to a minimum.

    What motivated you to work on the problem that Prerender solves?

    I was excited by the idea of working on a platform that helps to move the web forward.

    You see, with Prerender our customers are rolling out user experience-focused websites and instead of concentrating on SEO they provide the best for their customers. In the past years anytime we built a new landing page we always used Wordpress just to get the best SEO out of it and reserved the power of SPA’s only for the non-indexed pages like the administration section. But now, I work with a company which helps to solve problems that held me back in the past ^.^

    What technology stack do you use, and why did you choose this stack?

    We use Javascript everywhere; since we solve the “issues” caused by Javascript rendering, we want to build as much expertise as possible in this field. But for the other parts, we are taking advantage of CloudFlare’s distributed system for fast response and global scalability, while our uptime guarantees are supported by Digital Ocean’s cloud platform. We also use a myriad of other SaaS providers to maximize our effectiveness.

    What will the world look like once your company achieves its vision?

    When the question comes up “Can we use React for our new site?” the answer will be “For sure!”, because right now the marketing departments are always vetoing anything which can reduce the SEO ranking. I would say, rightfully. As for our customers, even if they lose 1% of effectiveness, they would need to pump their ads budget with hundreds of thousands of dollars.

    What does a typical day look like for you?

    Haha, lots of customer calls! As we aim to keep our dedicated team small and effective, I am more often than not in the onboarding calls with them. But it’s fun for me! I always loved to talk with customers, learn about their situation, and talk about solutions. This makes my job a lot easier, since we don’t have to come up with ideas, our customers are telling us everything we need to know. And I believe this is the best kind of situation, to be customer driven and my KPI is the number of happy customers.

    Describe your computer hardware setup

    Oh my, this would be worth an article itself. I am kinda a geek, and have 8 dedicated servers at home, while I am mostly working on my macbook for convenience. But when I get time for programming I spin up my “workstation” which runs Manjaro. But rarely when I get a bit of me time, I secretly turn on my windows pc for gaming. And at time of writing, I am surrounded by laptops, raspberries, and tablets as well.

    Building machines and running downscaled tests is my late-night hobby.

    Describe your computer software setup

    VSCode is a definitive solution for me, I am not really fond of any programming language and it gives me the freedom to just install an extension and write IDE supported code in seconds. Also, I had the luck to be in the beta group for CoPilot and it is a definitive game changer.

    For source control GitHub is awesome, but I would never discount other solutions either. GitLab has become a really awesome tool in recent years.

    Messaging, I think Slack still is the most widespread professional choice, and since it does its job, there is no reason to switch away from it. But recently I found a very interesting software called Spike and for the past 3 months, I have been using it as my de facto email client as it makes email conversations much easier.

    Essential tools: Docker, there is no other way, it changed the industry for the best. I still remember the dark old days when we had to install dependencies and solve package conflicts…

    But yeah, Kubernetes is slowly reaching the same level of adoption.

    Do you have any advice for software engineers who are just starting out?

    Don’t be afraid to talk with the customers. Throughout my career, the best software engineers were the ones who worked with the customer to solve their problems. Sometimes you can sack a half year of development time just by learning that you can solve the customer’s issue with a single line of code. I think the best engineers are creating solutions for real world problems.

    Are you hiring and for what roles?

    Always! We always aim to only hire when we can ensure that our new colleagues will have a meaningful role and they make a definite contribution. But at the moment we have grown so much that we need to grow our team in every department. So, instead of listing just check our career page :D saas.group

    Where can we go to learn more?

    Check out our site at prerender.io, and if you are interested in having a call with me about prerendering and how it changes the web, reach me by email at varga@prerender.io. I am always happy to jump on a call and learn about your situation and use cases ^.^

    Zsolt Varga is the General Manager of Prerender, a Google-recommended software tool used by more than 12,000 companies that allows search engines to better crawl and index Javascript websites.


