GPU Can Speed Up Models

In this article, we will look at how to speed up your models by using a GPU. We will also see how to split the computations across multiple devices, including the CPU and multiple GPU devices.

Thanks to GPUs, instead of waiting for days or weeks for a training algorithm to complete, you may end up waiting for just a few minutes or hours. Not only does this save an enormous amount of time, but it also means that you can experiment with various models much more easily and retrain your models on new data more frequently.

Source – O’Reilly

You can often get a significant performance boost by merely adding GPU cards to a single machine. In fact, in many cases, this will suffice; you won’t need to use multiple machines at all. For example, you can typically train a neural network just as fast using four GPUs on a single machine as using eight GPUs across multiple machines, due to the extra delay imposed by network communications in a distributed setup. Similarly, using a single powerful GPU is often preferable to using multiple slower GPUs.
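To make the trade-off above concrete, here is a toy calculation. It is only a sketch: the `step_time` function, the assumption that work divides perfectly across GPUs, and every number in it are hypothetical values chosen for illustration, not measurements.

```python
# Toy model of one training step. All numbers are hypothetical,
# chosen only to illustrate the trade-off described above.


def step_time(compute_seconds, num_gpus, sync_overhead_seconds):
    """Estimate the duration of one training step.

    compute_seconds: time the step would take on a single GPU.
    num_gpus: number of GPUs sharing the work (assumed to split it evenly).
    sync_overhead_seconds: fixed cost of synchronizing the GPUs each step.
    """
    return compute_seconds / num_gpus + sync_overhead_seconds


# Four GPUs in one machine: fast local links, small overhead (assumed).
single_machine = step_time(compute_seconds=0.8, num_gpus=4, sync_overhead_seconds=0.02)

# Eight GPUs spread over several machines: larger network overhead (assumed).
multi_machine = step_time(compute_seconds=0.8, num_gpus=8, sync_overhead_seconds=0.12)

print(f"4 GPUs, one machine:      {single_machine:.3f} s per step")
print(f"8 GPUs, several machines: {multi_machine:.3f} s per step")
```

With these made-up values, the network overhead cancels out the benefit of doubling the number of GPUs. Real overheads depend on the model, the batch size, and the network, so treat this as intuition only.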

Using a GPU-Equipped Virtual Machine

All major cloud platforms now offer GPU VMs, some preconfigured with all the drivers and libraries you need (including TensorFlow). Google Cloud Platform enforces various GPU quotas, both worldwide and per region: you cannot just create thousands of GPU VMs without prior authorization from Google. By default, the worldwide GPU quota is zero, so you cannot use any GPU VMs.

Therefore, the very first thing you need to do is to request a higher worldwide quota. In the GCP Console, open the navigation menu and go to IAM & admin → Quotas. Click Metric, click None to uncheck all metrics, then search for “GPU” and select “GPUs” to see the corresponding quota. If this quota’s value is zero, then check the box next to it and click “Edit quotas.” Fill in the requested information, then click “Submit request.”

It may take a few hours (or up to a few days) for your quota request to be processed and accepted. By default, there is also a quota of one GPU per region and per GPU type. You can request to increase these quotas too: click Metric, select None to uncheck all metrics, search for “GPU,” and choose the type of GPU you want (e.g., NVIDIA P4 GPUs). Then click the Location drop-down menu, click None to uncheck all locations, and select the location you want; check the boxes next to the quota(s) you want to change, and click “Edit quotas” to file a request.
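If you prefer to check your quotas from a script rather than the console, the sketch below may help. It assumes the `gcloud` CLI is installed and authenticated, and that the JSON printed by `gcloud compute project-info describe` contains a `quotas` list whose entries have `metric`, `limit`, and `usage` fields. The `sample` dictionary is hypothetical data in that assumed shape, so the filtering logic can run without `gcloud`.

```python
import json
import shutil
import subprocess


def gpu_quotas(project_info):
    """Return {metric: (usage, limit)} for every quota whose name mentions GPUS."""
    result = {}
    for quota in project_info.get("quotas", []):
        metric = quota.get("metric", "")
        if "GPUS" in metric:
            result[metric] = (quota.get("usage"), quota.get("limit"))
    return result


def fetch_project_info():
    """Describe the current project via gcloud; return None if that is not possible."""
    if shutil.which("gcloud") is None:
        return None  # gcloud is not installed on this machine
    completed = subprocess.run(
        ["gcloud", "compute", "project-info", "describe", "--format=json"],
        capture_output=True, text=True, check=False,
    )
    if completed.returncode != 0:
        return None  # e.g., not authenticated or no project selected
    return json.loads(completed.stdout)


# Hypothetical sample in the assumed output shape (not real account data).
sample = {"quotas": [
    {"metric": "CPUS_ALL_REGIONS", "limit": 32.0, "usage": 0.0},
    {"metric": "GPUS_ALL_REGIONS", "limit": 0.0, "usage": 0.0},
]}
print(gpu_quotas(sample))
```

On a machine with `gcloud` configured, you would call `gpu_quotas(fetch_project_info())` instead of using the sample. A limit of zero for the worldwide GPU metric means you still need to file the quota request described above.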

Once your GPU quota requests are approved, you can quickly create a VM equipped with one or more GPUs by using Google Cloud AI Platform’s Deep Learning VM images: click View Console, then click “Launch on Compute Engine” and fill in the VM configuration form. Note that some locations do not have all types of GPUs, and some have no GPUs at all (change the location to see the types of GPUs available, if any).

Make sure to select TensorFlow 2.0 as the framework, and check “Install NVIDIA GPU driver automatically on first startup.” It is also a good idea to check “Enable access to JupyterLab via URL instead of SSH”: this will make it very easy to start a Jupyter notebook running on this GPU VM, powered by JupyterLab (an alternative web interface for running Jupyter notebooks).

Once the Notebook instance appears in the list (this may take a few minutes; click Refresh once in a while until it appears), click its Open JupyterLab link. This will run JupyterLab on the VM and connect your browser to it. You can create notebooks and run any code you want on this VM, and benefit from its GPUs.
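Before running real workloads, it can be worth confirming that the VM actually exposes a GPU. The sketch below assumes the NVIDIA driver was installed at startup, which provides the `nvidia-smi` tool; the helper name `list_nvidia_gpus` is made up for this example.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one description string per GPU reported by nvidia-smi.

    Returns an empty list when nvidia-smi is missing or fails, which usually
    means the NVIDIA driver is not installed or the VM has no GPU attached.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    completed = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if completed.returncode != 0:
        return []
    return [line.strip() for line in completed.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    for index, description in enumerate(gpus):
        print(f"GPU {index}: {description}")
else:
    print("No NVIDIA GPU detected")
```

If this prints “No NVIDIA GPU detected” on a VM that should have a GPU, the driver installation may still be in progress, so wait a few minutes and try again.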

But if you want to run some quick tests or easily share notebooks with your colleagues, you should try Colaboratory.

Google Colaboratory – Free GPU

The simplest and cheapest way to access a GPU VM is to use Colaboratory (or Colab, for short). It’s free! Just go to Google Colab and create a new Python 3 notebook: this will create a Jupyter notebook on your Google Drive (alternatively, you can open any notebook on GitHub or Google Drive, or you can even upload your own notebooks).

Colab’s user interface is similar to Jupyter’s, except you can share and use the notebooks like regular Google Docs. There are a few other minor differences (e.g., you can create handy widgets using special comments in your code).

When you open a Colab notebook, it runs on a free Google VM dedicated to you, called a Colab Runtime. By default, the Runtime is CPU-only, but you can change this by going to Runtime → “Change runtime type,” selecting GPU in the “Hardware accelerator” drop-down menu, then clicking Save. You could even choose TPU! (Yes, you can use a TPU for free; for now, though, select GPU.)
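After switching the Runtime to GPU, you can check that TensorFlow sees the accelerator. This is a minimal sketch that assumes TensorFlow is installed in the Runtime; it uses `tf.config.experimental.list_physical_devices`, chosen here because it is available in TensorFlow 2.0. The helper name `tensorflow_gpus` is made up for this example.

```python
def tensorflow_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # The experimental namespace is used because it exists in TensorFlow 2.0.
    return tf.config.experimental.list_physical_devices("GPU")


devices = tensorflow_gpus()
if devices is None:
    print("TensorFlow is not installed in this runtime")
elif devices:
    print(f"TensorFlow sees {len(devices)} GPU(s): {[d.name for d in devices]}")
else:
    print("No GPU visible: check the hardware accelerator setting")
```

If no GPU is listed, the Runtime is probably still CPU-only, so check the “Hardware accelerator” setting again.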

If you run multiple Colab notebooks using the same Runtime type, they will use the same Colab Runtime. So if one writes to a file, the others will be able to read that file. It’s essential to understand the security implications of this: if you run an untrusted Colab notebook written by a hacker, it may read private data produced by the other notebooks and then leak it back to the hacker. If this includes private access keys for some resources, the hacker will gain access to those resources.

Moreover, if you install a library in the Colab Runtime, the other notebooks will also have that library. Depending on what you want to do, this might be great or annoying (e.g., it means you cannot easily use different versions of the same library in different Colab notebooks).

Colab does have some restrictions: as the FAQ states, “Colaboratory is intended for interactive use. Long-running background computations, particularly on GPUs, may be stopped. Please do not use Colaboratory for cryptocurrency mining.” The web interface will automatically disconnect from the Colab Runtime if you leave it unattended for a while (~30 minutes).

When you reconnect to the Colab Runtime, it may have been reset, so make sure you always download any data you care about.
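One way to follow this advice is to zip your results and download the archive before you step away. The sketch below is an illustration under assumptions: the directory and file names are hypothetical placeholders, and the download step relies on `google.colab.files.download`, which is only importable inside a Colab Runtime (elsewhere the helper simply does nothing).

```python
import os
import shutil
import tempfile


def archive_results(results_dir, archive_base):
    """Zip results_dir into archive_base + '.zip' and return the archive path."""
    return shutil.make_archive(archive_base, "zip", root_dir=results_dir)


def download_if_in_colab(path):
    """Trigger a browser download when running in Colab; return False elsewhere."""
    try:
        from google.colab import files  # only importable inside a Colab Runtime
    except ImportError:
        return False
    files.download(path)
    return True


# Demonstration with a throwaway directory standing in for real results.
work_dir = tempfile.mkdtemp()
results_dir = os.path.join(work_dir, "results")  # hypothetical directory name
os.makedirs(results_dir)
with open(os.path.join(results_dir, "metrics.txt"), "w") as f:
    f.write("accuracy: placeholder\n")  # placeholder content, not a real result

archive_path = archive_results(results_dir, os.path.join(work_dir, "results_backup"))
print("Archive written to", archive_path)
download_if_in_colab(archive_path)
```

In a real notebook, you would point `archive_results` at the directory where your checkpoints or outputs are saved.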

Even if you never disconnect, the Colab Runtime will automatically shut down after 12 hours, as it is not meant for long-running computations. Despite these limitations, it’s a fantastic tool to run tests quickly, get quick results, and collaborate with your colleagues.

I hope you liked this article on using a GPU to speed up your models. Feel free to ask your questions in the comments section.

Aman Kharwal

I am a programmer from India, and I am here to guide you with Data Science, Machine Learning, Python, and C++ for free. I hope you will learn a lot in your journey towards Coding, Machine Learning and Artificial Intelligence with me.
