Before we start developing models, we need a few tools to help us. Almost everything we use is compatible with Mac, PC, and Linux, so it does not matter which platform you are on. There are three main items we need to install: a language to develop our models in, a database to store our data in, and a cloud computing space to deploy our models in. There is a fantastic technology stack ready to support these needs. We can use the Python programming language to develop our models, MySQL to store our data, and AWS to run our cloud computing processes. Let's take a closer look at these three items.
Python (programming language)
Python is one of the most commonly used programming languages and one of the most sought-after skills in the data science industry today. There are several ways you can install Python on your computer. You can install the language in its standalone form from Python.org. This will provide you with a basic Python interpreter where you can run commands and execute scripts. Alternatively, you can use Anaconda, available from anaconda.com, which installs Python, pip (a tool that helps you install and manage Python libraries), and a collection of other useful libraries in one step. If you want a working version of Python and its associated libraries on your computer as quickly as possible, Anaconda is highly recommended. In addition to Python, we will need to install libraries to assist in a few areas. Think of libraries as nicely packaged portions of code that we can import and use as we see fit. Anaconda will, by default, install a few important libraries for us, but there will be others that we will need. We can install those as we go using pip.
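As a quick sanity check after installation, a short script along the following lines can report which Python version is running and whether a given set of libraries can be imported. This is only a sketch: the library names listed (numpy, pandas, scikit-learn) are assumptions chosen for illustration, not a list prescribed by this post, and check_libraries is a hypothetical helper name.

```python
import importlib.util
import sys


def check_libraries(import_names):
    """Map each top-level import name to True if Python can locate it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in import_names
    }


# Assumed examples: import name -> name to pass to "pip install".
# The two can differ (scikit-learn is imported as "sklearn").
LIBRARIES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "sklearn": "scikit-learn",
}

print("Python version:", sys.version.split()[0])

for import_name, found in check_libraries(LIBRARIES).items():
    if found:
        print(f"{import_name}: installed")
    else:
        print(f"{import_name}: missing, try 'pip install {LIBRARIES[import_name]}'")
```

Any library reported as missing can then be installed from the command line with pip, using the name shown in the message.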
MySQL (database)
When handling vast quantities of information, we will need a place to store all of our data throughout the analysis and preprocessing phases of our projects. For this, we will use MySQL, one of the most common relational databases used to store and retrieve data. We will take a closer look at how to work with MySQL using SQL (Structured Query Language). In addition to the MySQL relational database, we will also explore the use of DynamoDB, a non-relational (NoSQL) database that has gained quite a bit of popularity in recent years.
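To give a feel for what storing and retrieving data with SQL looks like from Python, here is a minimal sketch. Note that it deliberately swaps in SQLite (through Python's built-in sqlite3 module) instead of MySQL, so that it runs without a database server; the table name, columns, and values are all made up for illustration. The SQL statements themselves would look much the same in MySQL, although MySQL client libraries typically use %s rather than ? as the parameter placeholder.

```python
import sqlite3

# Stand-in for a MySQL connection: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a small table (hypothetical schema for illustration).
cur.execute(
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, "
    "sample TEXT NOT NULL, "
    "value REAL NOT NULL)"
)

# Store a few rows using parameterized inserts.
rows = [("a", 1.5), ("b", 2.5), ("a", 3.5)]
cur.executemany("INSERT INTO measurements (sample, value) VALUES (?, ?)", rows)
conn.commit()

# Retrieve an aggregate: the average value per sample.
cur.execute(
    "SELECT sample, AVG(value) FROM measurements "
    "GROUP BY sample ORDER BY sample"
)
averages = cur.fetchall()
print(averages)

conn.close()
```

The same pattern (connect, create a table, insert rows, query them back) carries over to a real MySQL server once a connection library and credentials are in place.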
AWS and GCP (Cloud Computing)
Finally, after developing our machine learning models in Python and training them using the data in our databases, we will deploy our models to the cloud using both Amazon Web Services (AWS) and Google Cloud Platform (GCP). In addition to deploying our models, we can also explore a number of useful tools and resources, such as SageMaker, EC2, and Autopilot (AWS), and Notebooks, App Engine, and AutoML (GCP).