Cloud

The engineering behind autoscaling with HashiCorp's Nomad on a global serverless platform

The usual ways to handle load spikes on a service are not cost-effective: you either pay for resources you don't use, or you risk not having enough resources to handle the load. Fortunately, there is a third way: horizontal autoscaling. Horizontal autoscaling is the process of dynamically adjusting the number of instances of a service based on the current load. This way, you only pay for the resources you use, and you can handle load spikes without any manual intervention.
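To make the idea of adjusting the instance count to the current load concrete, here is a minimal sketch of a proportional scaling rule. It is an illustration under assumptions, not the algorithm used by Nomad or by the platform described in the post: the function name, the per-instance load metric, and the min/max bounds are all hypothetical.

```python
import math


def desired_instances(
    current_instances: int,
    current_load_per_instance: float,
    target_load_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return how many instances would bring per-instance load back to target.

    Hypothetical helper: the metric (e.g. requests per second per instance)
    and the bounds are assumptions chosen for illustration.
    """
    if target_load_per_instance <= 0:
        raise ValueError("target_load_per_instance must be positive")

    # Proportional rule: scale the instance count by observed / target load.
    raw = math.ceil(
        current_instances * current_load_per_instance / target_load_per_instance
    )

    # Clamp so the service never scales to zero or beyond the allowed maximum.
    return max(min_instances, min(max_instances, raw))


if __name__ == "__main__":
    # 4 instances each seeing 90 req/s against a 60 req/s target.
    print(desired_instances(4, 90.0, 60.0))
```

With 4 instances at 90 requests per second each and a target of 60, the rule asks for 6 instances; when load drops, the same rule scales back down, bounded by the minimum.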

A Software Engineer's Tips and Tricks #3: CPU Utilization Is Not Always What It Seems

Hey there! We're back for our third edition of Tips and Tricks. As we said in our first posts on Drizzle ORM and Template Databases in PostgreSQL, our new Tips and Tricks mini blog series is going to share some helpful insights and cool tech that we've stumbled upon while working on technical stuff. Today's topic is short and sweet. It'll be on CPU utilization and what that metric indicates. If you enjoy it and want to learn more, I encourage you to check out the "further reading" links.

Overcoming scale challenges with AWS & CloudFront - 5 key takeaways

The Ably service handles massive amounts of data throughput and concurrent connections for many customers while maintaining a highly reliable and available service, with a five-nines (99.999%) uptime guarantee. Ably has no scale ceiling, and that’s challenging work (it’s one of the reasons I joined Ably). While the challenges we face in delivering our service are compelling, we sometimes face novel internet-scale problems, such as breaching the limits of AWS services!

Serverless GPUs in Private Preview: L4, L40S, V100, and more

Today, we’re excited to share that Serverless GPUs are available for all your AI inference needs directly through the Koyeb platform! We're starting with GPU Instances designed to support AI inference workloads, including both heavy generative AI models and lighter computer vision models. These GPUs provide up to 48GB of VRAM, 733 TFLOPS, and 900GB/s of memory bandwidth to support large models, including LLMs and text-to-image models.

A Software Engineer's Tips and Tricks #2: Template Databases in PostgreSQL

Hey there! We're back for our second edition of Tips and Tricks. As we said in our first post on Drizzle ORM, our new Tips and Tricks mini blog series is going to share some helpful insights and cool tech that we've stumbled upon while working on technical stuff. Today, we're going to talk about the template databases of PostgreSQL. Remember, these posts will be super short reads. If you don’t like the topic of one of the posts, no problem! Just skip it and check out the next one.

What are LLMs? An intro into AI, models, tokens, parameters, weights, quantization and more

To keep up with everything happening in the world of artificial intelligence, it helps to understand the key terms and concepts behind the technology. In this introduction, we are going to dive into what generative AI is, looking at the technology and the models it is built on. We'll discuss how these models are built, trained, and deployed into the world.

A Software Engineer's Tips and Tricks #1: Drizzle

Hey there! At Koyeb, we really like diving into technical stuff. But here’s the thing: not every cool thing we stumble upon or think about needs a massive blog post. And honestly, not everything we’re into is directly related to what Koyeb does or about infrastructure in general. So, we’ve got an idea: what if we start sharing these bits and pieces with you in a series of really short blog posts?

What is a Microapp: An Emerging Trend

The microapp trend is on the rise! In the approximately two years since joining the DreamFactory team, I’d estimate I’ve conversed with more than one thousand companies about their API-based projects. These conversations provide a great opportunity to peer inside the IT operations of organizations large and small, not to mention pick up on emerging technology trends.