Virtual Machines for Data Science

A convenient way to practice data science on your laptop.

How It Works

Download and install VirtualBox

Download and install Vagrant

Download and install the virtual machine

Launch the Python notebook and start practicing
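
Before moving on to the virtual machine, it can help to confirm that the first two steps worked. The sketch below is an illustration, not part of the original instructions. It assumes the VirtualBox command-line tool is named `VBoxManage` and the Vagrant tool is named `vagrant`, and that both are on your PATH (on Windows, `VBoxManage` is often not on the PATH by default, so a "missing" result there may be a false alarm). The helper names `find_missing` and `next_step` are hypothetical.

```python
import shutil

# Display name -> executable name expected on the PATH (assumed names).
REQUIRED_TOOLS = {"VirtualBox": "VBoxManage", "Vagrant": "vagrant"}


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the display names of tools whose executable is not on the PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


def next_step(missing):
    """Suggest what to do next, given the list of missing tools."""
    if missing:
        return "Install first: " + ", ".join(missing)
    return "Run `vagrant up` in the folder that holds the Vagrantfile"


if __name__ == "__main__":
    print(next_step(find_missing()))
```

If nothing is missing, running `vagrant up` in the folder containing the downloaded Vagrantfile is the usual way to build and start the machine.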

Installation instructions

For VirtualBox and Vagrant

Please see this presentation for help with the most common errors and issues:

Python Virtual Machine

Download the Vagrantfile to get started:

Description

Use this virtual machine, which comes with popular Python libraries and databases preinstalled, for a variety of data science and engineering tasks.

Python scientific libraries:

  • scikit-learn (sklearn)
  • pandas
  • nltk
  • xgboost
  • numpy
  • NetworkX

Visualization libraries:

  • matplotlib + basemap
  • seaborn
  • bokeh
  • plotly

Social connector libraries:

  • tweepy
  • facebook-sdk
  • python-linkedin

Web framework libraries:

  • flask
  • django

Databases:

  • Cassandra
  • Neo4j
  • MongoDB
  • MySQL
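
Once the machine is running, a quick way to see which of the Python libraries listed above are actually available is to check whether each one can be located by the interpreter. This is a minimal sketch, not an official check shipped with the machine. The import names in `IMPORT_NAMES` are assumptions based on how these packages are commonly imported (for example, `facebook` for facebook-sdk and `linkedin` for python-linkedin), and `check_libraries` is a hypothetical helper.

```python
import importlib.util

# Package name as listed above -> module name used in `import` (assumed).
IMPORT_NAMES = {
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "nltk": "nltk",
    "xgboost": "xgboost",
    "numpy": "numpy",
    "NetworkX": "networkx",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "bokeh": "bokeh",
    "plotly": "plotly",
    "tweepy": "tweepy",
    "facebook-sdk": "facebook",
    "python-linkedin": "linkedin",
    "flask": "flask",
    "django": "django",
}


def check_libraries(names=IMPORT_NAMES):
    """Map each package name to True if its module can be found, else False."""
    status = {}
    for package, module in names.items():
        try:
            status[package] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # Treat any lookup failure as "not available".
            status[package] = False
    return status


if __name__ == "__main__":
    for package, available in sorted(check_libraries().items()):
        print(f"{package:16} {'ok' if available else 'MISSING'}")
```

The check only locates each module without importing it, so it runs quickly even for heavy libraries.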

 

 



Spark Virtual Machine

Download the Vagrantfile to get started:

Description
  1. This virtual machine lets you develop Spark applications in Python on your local machine, using a single-node Spark cluster.
  2. Use the IPython notebook editor to write and run your PySpark programs.
  3. Educational datasets and labs from Data Science School are included in this machine.
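
To give a feel for the kind of PySpark program you might write in the notebook, here is a minimal word-count sketch using the RDD API. It is an illustration only and is not taken from the included labs. It assumes `pyspark` is importable inside the machine; the helper names `tokenize`, `word_count`, and `main` are hypothetical, and `local[*]` is used as the master so that everything runs on the single node.

```python
import re


def tokenize(line):
    """Split a line of text into lowercase words."""
    return re.findall(r"[a-z']+", line.lower())


def word_count(sc, lines):
    """Count words with the RDD API; `sc` must be a live SparkContext."""
    return (
        sc.parallelize(lines)
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts per word
        .collectAsMap()                    # bring the result back as a dict
    )


def main():
    # Imported here so the helpers above can be loaded without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "word-count-sketch")
    try:
        print(word_count(sc, ["to be or not to be"]))
    finally:
        sc.stop()
```

If your notebook already provides a SparkContext (often exposed as `sc`, though that is an assumption about this setup), call `word_count(sc, lines)` directly instead of `main()`, since creating a second context would fail.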

Spark 2.0 Virtual Machine

Download the Vagrantfile to get started:

Description
  1. This virtual machine lets you develop Spark 2.0 applications in Python on your local machine, using a single-node Spark cluster.
  2. Use the IPython notebook editor to write and run your PySpark programs.
  3. Educational datasets and labs from Data Science School are included in this machine.
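
Spark 2.0 introduced `SparkSession` as the main entry point, so a program for this machine might look slightly different from one written for the earlier Spark machine. The sketch below is an illustration under assumptions, not code from the included labs: it assumes `pyspark` is importable inside the machine, and the helper names `build_rows`, `count_words`, and `main` are hypothetical.

```python
def build_rows(words):
    """Wrap each word in a one-element tuple, the row shape createDataFrame expects."""
    return [(word,) for word in words]


def count_words(spark, words):
    """Count occurrences with the DataFrame API; `spark` must be a live SparkSession."""
    df = spark.createDataFrame(build_rows(words), ["word"])
    counted = df.groupBy("word").count()  # adds a column named "count"
    return {row["word"]: row["count"] for row in counted.collect()}


def main():
    # Imported here so the helpers above can be loaded without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")          # run on the single local node
        .appName("dataframe-sketch")
        .getOrCreate()
    )
    try:
        print(count_words(spark, ["spark", "python", "spark"]))
    finally:
        spark.stop()
```

Rows are read with `row["count"]` rather than `row.count`, because `count` is also a built-in method on the row's underlying tuple type.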

Benefits of virtual machines

Easy to use

Comprehensive set of libraries

Aligned with industry standards
