A Gentle Introduction to Algorithmia
The platform that helps you deploy, serve, and manage machine learning models at scale
What is Algorithmia?
You’ve decided it’s time to integrate machine learning into your application — you figured out how to train a model and achieve high accuracy on your test set, and now you’re ready to deploy it — way to go! The hard part is over, right? Well, it turns out that there’s a lot that goes into using that model in your application in production. You actually need to deal with concepts such as scalability, versioning, monitoring, reporting, infrastructure management — the list goes on.
Do you now have to hire a team and spend millions on building an in-house production ML platform?
No — you can instead leverage services of providers that are experts in MLOps (or DevOps for ML) and save your budget for more mission-critical tasks. Algorithmia is one of these providers.
The idea for Algorithmia was actually sparked by an observation made by the founder, Diego Oppenheimer, while he was studying for his MS in Data Analytics from Carnegie Mellon University. He realized the potential for the profound impact of ML on the world but noticed the enormous frustration that existed in bringing ML ideas from academia into production. Academics were often nervous about the same thing you were before you read the last paragraph — they didn’t have the time or resources to deal with the costs of building and maintaining a scalable production ML system. Oppenheimer came up with an interesting idea — what if there was a marketplace where data scientists and machine learning engineers could build and monetize artificial intelligence and machine learning models without dealing with the burdens of MLOps?
This later evolved into modern-day Algorithmia, a platform that assists organizations in deploying their models into production quickly while providing the capabilities of monitoring, management, and security all in one place.
It’s no secret that managing the life cycle of machine learning projects can be tricky. Between the expensive process of training sophisticated models (and fine-tuning them) and integrating them with existing applications, most ML models that are developed actually never even make it to production — 85% according to Algorithmia! Using the platform's powerful pipeline automation tools, 110,000 engineers and data scientists so far (including organizations ranging from the UN to Fortune 500 companies) have reaped the benefits of using a fully managed MLOps service.
Scenario: Movie Recommender System
Now that we have a better understanding of what Algorithmia is all about, let’s walk through an example of deploying a basic movie recommendation algorithm to production using Algorithmia.
Algorithm
We want to deploy an algorithm that takes one movie ID as input and returns 20 pairs of movie ID and similarity score, one pair for each of the 20 movies most similar to the input movie. Similarity is based on how users rated the movies (item-based collaborative filtering).
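To make the idea concrete, here is a minimal sketch of item-based collaborative filtering on a toy rating table. The article does not say which similarity measure the deployed model uses, so cosine similarity is an assumption here, and the RATINGS data and function names are purely illustrative.

```python
from math import sqrt

# Toy ratings: user -> {movie_id: rating}. Illustrative data only.
RATINGS = {
    "u1": {1: 5.0, 2: 4.0, 3: 1.0},
    "u2": {1: 4.0, 2: 5.0, 3: 2.0},
    "u3": {1: 1.0, 2: 2.0, 3: 5.0},
}


def item_vectors(ratings):
    """Invert user -> movie ratings into movie -> {user: rating}."""
    vectors = {}
    for user, rated in ratings.items():
        for movie_id, rating in rated.items():
            vectors.setdefault(movie_id, {})[user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (assumed metric)."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(movie_id, ratings, top_n=20):
    """Return up to top_n (movie_id, score) pairs, most similar first."""
    vectors = item_vectors(ratings)
    target = vectors[movie_id]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != movie_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

With the toy data, most_similar(1, RATINGS) ranks movie 2 ahead of movie 3, because the users who liked movie 1 also liked movie 2. In practice these scores would typically be precomputed offline and stored, so the deployed algorithm only has to look them up.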
Creating an algorithm on Algorithmia
Once we have signed into Algorithmia’s MLOps platform, we can create a new algorithm by clicking the Create New button in the top right corner of the dashboard and selecting Algorithm. We will name the algorithm ItemBasedCF.
We can choose to host the repository on GitHub or on Algorithmia; we chose the latter for this example. Once the algorithm has been created, we can click on the Source Code tab to add our code to the repository. However, before we do that, we first need to add a new Data Collection to store our trained model and other data required by our algorithm. Thankfully, Algorithmia has data hosting capabilities for every user, which we can find under the Data Sources page. We click My Hosted Data followed by New Collection to create a data folder that our algorithm can access. Then, we upload the required files to the newly created data collection.
Finally, we click on the Source Code tab, which will open Algorithmia’s IDE, where we can start adding our code for the algorithm. The API calls to our algorithm will trigger the apply method, which takes one JSON object as an input argument. We add our code to the automatically generated ItemBasedCF.py file.
Before we proceed, we must add the algorithm’s dependencies by clicking on the DEPENDENCIES button. When everything looks in order, we can build our algorithm by clicking SAVE, followed by BUILD. It may take a few minutes for the build to finish. Algorithmia also keeps a log of all builds under the Builds tab. Once the build has finished, we can test our algorithm by entering a valid movie ID into the console at the bottom of the page.
And voilà, we have finished creating the algorithm! We can now click PUBLISH to deploy version 1.0.0 of our newly created algorithm to production. Furthermore, Algorithmia keeps track of all algorithm versions, which can be accessed under the Versions tab.
Using the newly deployed algorithm
Our movie recommendation system can now use the algorithm to fulfill recommendation requests. Algorithmia offers client libraries in several languages, including Python, Java, JavaScript, Node.js, Go, C#, Scala, Swift, and many others, thereby ensuring seamless integration between our algorithm and our service. The algorithm can be called from a Python client or a JavaScript client, among others.
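As a hedged illustration of the Python side, the sketch below wraps the call in a small helper. It relies on the Python client's Algorithmia.client, client.algo, and pipe(...).result calls. The API key, the your_username path segment, and the make_client and recommend helper names are placeholders introduced for this example.

```python
def make_client(api_key):
    """Build an Algorithmia client (requires `pip install algorithmia`)."""
    import Algorithmia  # imported lazily so the helper below is easy to test

    return Algorithmia.client(api_key)


def recommend(client, movie_id, algo_path="your_username/ItemBasedCF/1.0.0"):
    """Call the published algorithm and return its list of pairs."""
    algo = client.algo(algo_path)
    return algo.pipe(movie_id).result


# Example usage (placeholder key and username):
#   client = make_client("YOUR_API_KEY")
#   for other_id, score in recommend(client, 1):
#       print(other_id, score)
```

Pinning the version in the path (1.0.0 here) keeps the service on a known release even after newer versions of the algorithm are published.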
Tradeoffs
Algorithmia boasts some impressive statistics: they advertise that their platform leads to 12x faster model deployment and is 7x more cost-effective compared to other cloud providers. This is definitely where Algorithmia shines. If those figures hold for your workload, using their service will greatly improve your model deployment efficiency and reduce the associated costs.
For startups and midsize companies, the service Algorithmia provides is especially important. As one Senior Data Scientist at Toyota points out, having a “white glove” on-call DevOps service saves them from having to develop their own DevOps system from scratch, which would require the company to staff a full DevOps team. When resources aren’t unlimited, as is the case for most companies, paying for a fully managed MLOps service seems like the way to go, allowing the company to focus their budget on essential core resources: software engineers to develop and improve the application and data scientists to focus on the model itself.
Remember when I mentioned that Algorithmia was originally conceived as a marketplace for hosting and monetizing AI and ML models? This is actually one of Algorithmia’s products: a huge library of pre-trained models that data scientists can upload, and that others can integrate into their codebase or interact with through the Algorithmia CLI. Besides allowing data scientists to bring in some revenue from their machine learning creations, it importantly provides equal access to many of today’s cutting-edge AI algorithms. The fact is, in today’s world, few companies have the resources to train these highly sophisticated algorithms from scratch. Consider GPT-3, the OpenAI language model that is perhaps the best in its class. Its training cost has been estimated at more than $4.6 million, a figure that is most likely beyond what the typical company can afford for training a single ML model. One of the most popular algorithms on the marketplace is offered by UC Berkeley computer vision labs, an algorithm that colorizes black and white photos.
There are drawbacks, too. As users have pointed out, using Algorithmia isn’t always as smooth as it seems. Algorithmia is, after all, a startup, so they aren’t quite at the point where they can provide seamless “white glove” service 100% of the time. That being said, if they proceed on their current trajectory, they will grow rapidly as AI becomes more integrated into our society. Others have expressed frustration that the process of retraining models with new training data could be made more intuitive.
So what does this all cost? For teams or small to medium-sized businesses, Algorithmia allows you to pay as you go for computation costs, which come out to $0.0001/sec for CPU and $0.0004/sec for GPU. Getting 24/7 priority support, though, will set you back $300/month. Larger groups looking for a more customized experience can opt for Algorithmia “Enterprise Dedicated” or even “Enterprise Advanced” to gain access to features such as virtual private cloud or on-premises hosting, as well as advanced security and privacy features.
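To get a feel for these rates, the following back-of-the-envelope sketch estimates a compute bill from the per-second prices quoted above. The call volume and per-call runtime are made-up inputs, and the compute_cost helper is illustrative only.

```python
CPU_RATE = 0.0001  # USD per second of CPU time, as quoted above
GPU_RATE = 0.0004  # USD per second of GPU time, as quoted above


def compute_cost(calls, seconds_per_call, rate):
    """Estimated pay-as-you-go compute cost in USD."""
    return calls * seconds_per_call * rate


# Hypothetical workload: one million calls at half a second each.
cpu_bill = compute_cost(1_000_000, 0.5, CPU_RATE)
gpu_bill = compute_cost(1_000_000, 0.5, GPU_RATE)
```

For this hypothetical workload the CPU bill comes to about $50 and the GPU bill to about $200, before adding the optional $300/month for priority support.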
Regardless of the option you choose, it is extremely likely that it will save you a great deal of money and time to outsource your MLOps management, and you can’t go wrong with choosing Algorithmia for this service.