GPUs for Machine Learning

Last updated: August 18, 2022

A graphics processing unit (GPU) is specialized hardware that performs certain computations much faster than a traditional computer’s central processing unit (CPU). As the name suggests, GPUs were originally developed to accelerate graphics rendering — particularly for computer games — and free up a computer’s primary CPU cycles for other jobs.

In recent years GPUs have been used by data scientists to accelerate their research, both figuratively and literally. When GPUs are combined in a cluster, they create a sort of supercomputer that is uniquely qualified to perform certain tasks, and in particular they are ideally suited for machine learning. GPUs process data with a large number of cores, which allows for better processing of multiple computations simultaneously, and they have high memory bandwidth to handle huge amounts of data, which makes them uniquely qualified for many data research tasks.

The UW offers several alternatives for researchers to harness the power of GPUs for their research computing needs.

More about GPUs

What are different types of GPU?
In the past GPUs were integrated, embedded on a computer’s motherboard. Currently GPU use adapted to general purpose computing is called, appropriately, general purpose computing on graphics processing units, abbreviated GPGPU. With this evolution in their use, GPUs have been promoted — with a commensurate increase in computing power — from integration on the motherboard to an external enclosure with independent power. These external GPUs are connected to a host computer using a high speed connection such as a peripheral component interconnect (PCI) bus or proprietary NVlink in the case of NVIDIA GPUs..

Why do I want a GPU?
You want a GPU only if you need a GPU. There is typically a time investment to learn the necessary software tools and methods. That said there are software packages available such as Rapids intended to minimize your time spent on new learning curves. Data scientists using the Machine Learning method called Deep Learning find that GPUs are extremely useful in accelerating their work in training models. Other GPU-friendly research topic examples include geospatial analysis of forest conditions, the design of therapeutic proteins and the search for near-earth asteroids, all tasks where a large dataset can be broken up and fed into independent analysis pipelines. If your research computing may benefit from GPU acceleration we encourage you to look into this technology further.

GPU alternatives for UW researchers

There are three ways to leverage GPUs at the UW: Purchase GPU nodes in the UW’s Hyak supercomputer, buying (or more accurately, renting) GPU processing time in the Commercial Cloud, or using a dedicated GPU cluster, know as the Google Colaboratory

Hyak is the UW’s supercomputer, which has GPU nodes available to purchase for use in machine learning. Your nodes are yours to use 24x7x365, for a minimum period of four years (often longer), so this option is preferred if you have long-running jobs over a long period of time.

  • Hyak is operated as a condo-style model, whereby you buy hardware at cost, which is then integrated into the shared platform, which includes a software environment and is fully supported by UW-IT.
  • Free access is available for evaluation purposes.
  • Students get free access to Hyak for personal education, independent study, exploration and research purposes. Student access is for nodes purchased with Student Tech Fee funding and through membership in the Research Computing Club. Use for faculty research projects and class projects is not permitted through this access.
  • You can also make use of idle capacity on Hyak of other users’ nodes, for times when you need more GPUs. Similarly, others can use your nodes when you are not, but they are always immediately available for your use whenever you want them. Use of idle nodes is always on a pre-emption basis.
  • More about Hyak

UW-IT works with cloud providers to provide access to virtually unlimited amounts of GPU computing. The commercial cloud offers flexibility for researchers who only need to use GPU processing occasionally, but who find that the Colab is not sufficient.

  • Use GPU processing time on demand, and only pay for what you use. You can scale up to thousands of nodes instantaneously, limited only by your budget, and always under your control.
  • Several pricing models available, with deep discounts for long-term commitments.
  • Research grants are available from all three commercial cloud providers, up to $5,000 in value.
  • Students get free access to Hyak for personal education, independent study, exploration and research purposes. Student access is for nodes purchased with Student Tech Fee funding and through membership in the Research Computing Club. Use for faculty research projects and class projects is not permitted through this access.
  • More about the commercial cloud

A multi-user computing environment hosted on Google’s cloud computers. This environment is accessed via web browser and is built on an essential technology called the Jupyter Notebook. The Colab service features modest GPU support for machine learning workflows.

This is an option you should consider for exploratory purposes or relatively small jobs. Importantly, Colab cannot be used for UW courses, and cannotbe used for confidential or sensitive data; more information. JupyterHub for Teaching is a potential option for courses. Google Colaboratory is available for free, and via a subscription to Colab Pro for $10/month.

  • Offers free use of GPUs in a dedicated “Jupyter Notebook”, running on Google Cloud Platform.  There are limitations for free use, but a subscription model is available, where you pay by the month for better access and more resources.
  • A Colab Pro subscription for $10/month gives more capabilities (e.g., more sustained access, and access to higher memory GPUs).
  • Watch Introduction to Colab to learn more.
  • Read Colab Pro: Is it Worth the Money?, which compares the free and subscription versions.
  • More about Google Colaboratory

Which GPU option should I choose?

Which option you choose depends on several factors, including how much processing power you need, how big or long-term your project is, and how frequently you will need to use GPU processing time. This brief document will give you some things to consider, but it will definitely be worth discussing this question with the Research Computing team.

Think about your alternatives for GPU computing similar to the concept of owning a car vs using an on-demand ride-share service. If you own a car, it’s yours to use how you want. You can drive it once a week to the grocery, or you can take it on a months-long road trip, using it most of the day, every day. However, owning a car is a significant investment with an upfront cost, a capital expenditure (CapEx), so you’d likely only buy one if you needed it, or wanted the freedoms owning a car provides.

On the other hand, if you only need to use a car some of the time for specific tasks, you might choose to not own a car but use an on-demand ride-share service. With these services, you have no upfront cost, and only pay for the benefits of a car when you need them, an operational expenditure (OpEx). “Lease” options are also available, similar to renting a car for a few weeks, with deep discounts over ride-share.

You can use the Cloud TCO tool as a convenient way to compare the various options from a cost standpoint, including quantitative and qualitative factors.

Reasons to choose this option Reasons this option might not be ideal
Hyak
  • Dedicated resources for your research. Use whenever you need, for four years
  • You need to use GPUs for sustained, high-volume computations over long periods of time
  • You can tap into additional resources in Hyak as they are available with a shared “preempt queue” that lets you use spare idle GPU cycles in Hyak
  • You pay only node cost, for Arts & Sciences, College of Engineering, and College of the Environment. Others pay node cost plus “slot fees”. See Hyak for details.
  • You only need GPU resources for a short-term project, or have very intermittent need to use GPUs
  • You aren’t sure if GPUs are necessary for your research and don’t want to commit to buying four years on Hyak
  • Requires a capital expenditure to purchase and set up
Commercial Cloud
  • Flexibility to use and pay for only what you need
  • Best suited for intermittent or occasional usage over time
  • Reduced rates through UW-IT relationships (~5%; 25% for NIH grants on GCP)
  • Deep discounts available to use idle capacity on the commercial cloud (preemptible).
  • ‘Cloud Credits’ are available through grant programs from Cloud Providers, typically up to $5,000.
  • May be more expensive for heavy usage of high performance CPUs
Colab
  • Free to use GPU platform, with an affordable “pro” subscription that gives you more resources if you need them
  • Your need for GPUs is primarily exploratory or short-term
  • May be too slow or too limited for large jobs
  • Cannot be used for confidential or sensitive data, which includes use in courses. More information.

Ready to leverage the power of GPUs?

Research computing experts at UW-IT can help you determine if GPU computing would benefit your research needs, and help you navigate the alternatives to choose an option that works for you. Contact UW-IT to start leveraging the power of GPU computing. Email help@uw.edu to get started.

Get started with GPUs