Virtual Machines for Data Science
A convenient way to practice data science on your laptop.
Installation instructions
For VirtualBox and Vagrant
Please see this presentation for help with the most common errors and issues:
Python Virtual Machine
Download the Vagrant file to get started:

Use this virtual machine, which comes with popular Python libraries and databases preinstalled, for data science and engineering tasks.
Python scientific libraries:
- scikit-learn (sklearn)
- pandas
- nltk
- xgboost
- numpy
- NetworkX
Visualization libraries:
- matplotlib + basemap
- seaborn
- bokeh
- plotly
Social connector libraries:
- tweepy
- facebook-sdk
- python-linkedin
Web framework libraries:
- flask
- django
Databases:
- Cassandra
- Neo4j
- MongoDB
- MySQL
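Once the machine is running, the Python libraries listed above can be checked from inside it. The following is a minimal sketch, assuming a Python 3 interpreter in the VM; the import names in `EXPECTED_MODULES` (for example `facebook` for facebook-sdk and `linkedin` for python-linkedin) are assumptions about how each package is imported, and `missing_libraries` is a hypothetical helper, not something shipped with the machine. Databases are not covered, since they run as services rather than Python modules.

```python
from importlib.util import find_spec

# Display name -> top-level import name.
# The import names are assumptions; adjust them if the VM uses different packages.
EXPECTED_MODULES = {
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "nltk": "nltk",
    "xgboost": "xgboost",
    "numpy": "numpy",
    "NetworkX": "networkx",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "bokeh": "bokeh",
    "plotly": "plotly",
    "tweepy": "tweepy",
    "facebook-sdk": "facebook",
    "python-linkedin": "linkedin",
    "flask": "flask",
    "django": "django",
}


def missing_libraries(modules=EXPECTED_MODULES):
    """Return the sorted display names whose top-level module cannot be located."""
    return sorted(
        name for name, module in modules.items() if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Not importable: " + ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.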
Spark Virtual Machine
Download the Vagrant file to get started:

- This virtual machine lets you develop Spark applications in Python on your local machine, running on a single-node Spark cluster.
- Use the IPython notebook editor to write and execute your PySpark programs.
- Educational datasets and labs from Data Science School are included in this machine.
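As an illustration of the kind of PySpark program that could be run in a notebook on a single-node setup, here is a minimal word-count sketch. It assumes `pyspark` is importable in the notebook kernel and uses a local master (`local[*]`); the application name, the sample lines, and the helper names `tokenize` and `word_counts` are hypothetical and do not come from the machine's bundled labs.

```python
import re


def tokenize(line):
    """Split a line into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", line.lower())


def word_counts(lines):
    """Count words across the given lines with a local Spark context."""
    # Imported here so tokenize() stays usable where Spark is not installed.
    from pyspark import SparkConf, SparkContext

    # Assumption: a local master; reuses an existing context if the notebook has one.
    conf = SparkConf().setMaster("local[*]").setAppName("wordcount-sketch")
    sc = SparkContext.getOrCreate(conf)
    return (
        sc.parallelize(lines)
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts per word
        .collectAsMap()                    # bring the result back as a dict
    )


if __name__ == "__main__":
    sample = ["Spark runs on a single node", "a single node is enough to learn Spark"]
    print(word_counts(sample))
```

`SparkContext.getOrCreate` is used so that the sketch does not fail when the notebook has already started a context.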
Spark 2.0 Virtual Machine
Download the Vagrant file to get started:

- This virtual machine lets you develop Spark 2.0 applications in Python on your local machine, running on a single-node Spark cluster.
- Use the IPython notebook editor to write and execute your PySpark programs.
- Educational datasets and labs from Data Science School are included in this machine.
Benefits of virtual machines

- Easy to use
- Comprehensive set of libraries
- Aligned with industry standards
Contacts
Send us a message if you have any ideas or feedback.