Introduction to scaling Large Model training and inference using DeepSpeed
The DeepSpeed library optimizes training and inference for foundation models. Learn more about its features.
CLIP introduces a model that enables zero-shot learning on a new dataset (in addition to a new example) by using natural language to supervise pre-training. That is, to identify an object, you can provide the name or description of a new object that the model has not seen before. Traditionally a computer vision model was trained … Read more
Why S3 lifecycle policies? S3 lifecycle policies allow you to do two things: Reduce cost by deleting data that is no longer required. Implement your security policies by: Retaining data for the required duration and reducing cost by moving it to low-cost storage. Deleting data that you are not allowed to retain for … Read more
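The two uses named in this excerpt (moving retained data to low-cost storage, then deleting it once it is not needed) could be expressed as a single lifecycle rule. Here is a minimal sketch in Python, assuming a hypothetical `logs/` prefix and placeholder day counts. `build_lifecycle_rule` is an illustrative helper, not part of any AWS SDK. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, but nothing here calls AWS.

```python
import json


def build_lifecycle_rule(prefix, archive_after_days, delete_after_days):
    """Build one rule that archives objects under `prefix`, then expires them.

    Hypothetical helper for illustration; the day counts are placeholders.
    """
    if archive_after_days >= delete_after_days:
        raise ValueError("objects must be archived before they are deleted")
    return {
        "ID": f"archive-then-delete-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Move objects to a lower-cost storage class after N days.
        "Transitions": [
            {"Days": archive_after_days, "StorageClass": "GLACIER"}
        ],
        # Delete objects once the retention period has passed.
        "Expiration": {"Days": delete_after_days},
    }


# Assumed values: archive after 90 days, delete after 365 days.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 90, 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3, a dictionary of this shape would be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`, together with the bucket name.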
In this blog I will show you how to deploy a static website on Amazon S3, along with an HTTPS endpoint and content delivered by the CloudFront CDN. Here’s a video of the steps. Six steps to deploying a static website on Amazon S3. Step 1: Create a bucket. As the first step I create a … Read more
Encryption allows you to store objects in such a way that only an entity that has the encryption key, or access to it, can read the object. S3 encryption types: There are five ways in which you can encrypt objects that you write to S3. S3 Server Side Encryption using Amazon S3 Managed … Read more
There are three ways to control access to an S3 bucket and its objects: using bucket policies, using bucket Access Control Lists (ACLs), and using user policies. ACLs are used only in cases where objects are not owned by the bucket owner, i.e. objects are uploaded by another account and the bucket owner does not own these … Read more
In this blog we will talk about S3 pricing. Important points to consider are: Pricing is based on region. In this blog we will look at pricing for the Sydney region. Pricing varies based on storage used. The more storage you use, the lower the price per GB. Here’s the storage cost for 1000 GB in … Read more
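Tiered pricing of this kind can be worked through with a small calculation. The sketch below is in Python. The tier sizes and per-GB rates are invented round numbers chosen for easy arithmetic, not actual S3 prices for Sydney or any other region, and `tiered_storage_cost` is a hypothetical helper.

```python
# Hypothetical tiers: (tier size in GB, price per GB per month).
# A size of None means "everything above the previous tiers".
# These are NOT real S3 prices.
EXAMPLE_TIERS = [(100, 0.03), (400, 0.02), (None, 0.01)]


def tiered_storage_cost(total_gb, tiers=EXAMPLE_TIERS):
    """Return the monthly cost of storing `total_gb` under tiered pricing."""
    remaining = total_gb
    cost = 0.0
    for size_gb, price_per_gb in tiers:
        if remaining <= 0:
            break
        # Bill only the part of the data that falls inside this tier.
        billed_gb = remaining if size_gb is None else min(remaining, size_gb)
        cost += billed_gb * price_per_gb
        remaining -= billed_gb
    return cost


monthly_cost = tiered_storage_cost(1000)
print(f"1000 GB costs {monthly_cost:.2f} per month at the example rates")
print(f"Effective price per GB: {monthly_cost / 1000:.4f}")
```

Because each extra GB is billed at the rate of the tier it falls into, the effective price per GB drops as total usage grows, which is the behaviour the excerpt describes.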
Before we describe what Batch Normalisation is, here are a few introductory terms. Internal covariate shift: Stochastic Gradient Descent uses a minibatch of input to train the parameters of a layer. The input to a layer is the output from the previous layer. A change in the parameters of the previous layer causes a change … Read more
Why Distributed Machine Learning? In the previous article we looked at how GPGPUs, ASICs, AWS’s Inferentia, the new NVIDIA A100 chip and other advances in hardware have tremendously improved the performance of machine learning training and inference. However, the increase in the volume of data and the increasing complexity of machine learning models require … Read more
Machine learning models vary in size from a few parameters to hundreds of billions of parameters (e.g. GPT-3, with 175 billion). The training data ranges from a few hundred rows to millions of rows. Training a model on a single CPU is not always efficient, and so people started using GPUs. GPU vs CPU vs GPGPU: Wait a … Read more