Posted on June 23 2021


Plotting of data in Python

What is data?

Data is any information, such as numbers, words, measurements, observations or descriptions of things, stored in a digital format so that it can serve as a basis for analysis and decision making. Data can be qualitative or quantitative: qualitative data is descriptive information, while quantitative data is numerical information. Data can be broadly grouped into two types:

  • Traditional data: Traditional data is structured data in the form of tables containing numeric or text values. Its size is comparatively small, which makes it easy to manage and manipulate. Its volume typically ranges from gigabytes to terabytes, and it is managed in a centralized form. Traditional data is usually managed and accessed with Structured Query Language (SQL).
  • Big data: Big data refers to extremely large datasets and can be considered a scaled-up version of traditional data. The characteristics of big data are its 5 Vs: volume, velocity, variety, veracity and value. Its data sources include images, audio files, device data, sensor data, etc., and it deals with structured, semi-structured and unstructured data, analyzing huge amounts of complex datasets to extract meaningful information. Its volume ranges from petabytes to exabytes or zettabytes, and it is managed in a distributed form.
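
To make the point about SQL a little more concrete, here is a minimal sketch of how a small, structured table can be stored and queried from Python with the built-in sqlite3 module. The table name, column names and rainfall values are invented for illustration only; a real project would more likely connect to an existing database.

```python
import sqlite3

# In-memory database, so this sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# A small structured table (hypothetical name and columns).
conn.execute("CREATE TABLE rainfall (station TEXT, month TEXT, mm REAL)")
conn.executemany(
    "INSERT INTO rainfall VALUES (?, ?, ?)",
    [
        ("A", "Jan", 12.5),
        ("A", "Feb", 20.0),
        ("B", "Jan", 7.5),
        ("B", "Feb", 10.0),
    ],
)

# Access the data with an ordinary SQL query: total rainfall per station.
totals = conn.execute(
    "SELECT station, SUM(mm) FROM rainfall GROUP BY station ORDER BY station"
).fetchall()
conn.close()

print(totals)
```

The query returns one row per station, which is the kind of small tabular result that is easy to pass on to a plotting library.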

Data visualization in Python

Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends and correlations can be detected. There are many ways of plotting data, including column charts, bar graphs, area charts, scatter plots, heat maps, bubble charts and many more. Python is a programming language that offers several great graphing libraries, which come packed with features for interactive, live or highly customized plots. Some of the popular plotting libraries are:

  1. Matplotlib: It is one of the oldest and most popular Python plotting libraries and gives precise control over plots. It is a low-level library with a MATLAB-like interface, which offers a lot of freedom but usually requires writing more code. It is an open source library widely used in scientific computing and is especially good for creating basic graphs like line charts, bar charts and histograms.
  2. Seaborn: Seaborn is a Matplotlib-based data visualization library with a high-level and neat interface for preparing attractive and informative statistical graphics. It provides a dataset-oriented API for examining the relationships between variables, including automatic estimation and plotting of linear regression models. The Seaborn interface supports high-level abstractions for multi-plot grids and visualizes univariate and bivariate distributions.
  3. Plotly: Plotly's Python plotting library is built on top of the Plotly JavaScript library (plotly.js) and has three different interfaces: an object-oriented interface, an imperative interface to specify plots using JSON-like data structures, and a high-level interface (Plotly Express) that is similar to Seaborn.
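
As a rough illustration of the basic chart types mentioned above, the following sketch draws a bar chart and a line chart side by side with Matplotlib. The monthly values and the output file name are made up for the example, and the non-interactive "Agg" backend is selected only so that the script also runs on a machine without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

# Made-up sample data for illustration.
months = ["Jan", "Feb", "Mar", "Apr"]
rainfall_mm = [12.5, 20.0, 7.5, 15.0]
temperature_c = [4.0, 6.5, 10.0, 14.5]

# One figure with two side-by-side axes.
fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart of rainfall per month.
ax_bar.bar(months, rainfall_mm, color="steelblue")
ax_bar.set_title("Rainfall")
ax_bar.set_ylabel("mm")

# Line chart of temperature per month.
ax_line.plot(months, temperature_c, marker="o", color="firebrick")
ax_line.set_title("Temperature")
ax_line.set_ylabel("°C")

fig.tight_layout()
fig.savefig("example_plots.png")  # hypothetical output file name
```

The same data could also be plotted with Seaborn or Plotly Express in fewer lines. The Matplotlib version is shown here because it makes each step (creating the axes, drawing, labelling, saving) explicit.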



Integrating Python with GIS interface

Of the commonly used GIS packages, such as ArcGIS, QGIS (formerly Quantum GIS) and Earth Resources Data Analysis System (ERDAS) Imagine, QGIS allows Python scripts and code to be run directly within its interface. QGIS is open source software and provides a plugin for running Python code and scripts. The QGIS Python Console plugin is an interactive shell for executing Python commands and also includes a Python file editor for editing and saving scripts. In QGIS, Python can also be used in the following ways:

  • Issuing commands from the Python Console
  • Writing custom expressions and actions
  • Creating new processing algorithms
  • Creating new plugins and custom standalone applications.
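
As a small sketch of the first option, the snippet below lists the name and coordinate reference system of every layer loaded in the current QGIS project. The helper name summarize_layers is hypothetical. The qgis.core import only succeeds inside QGIS (for example in the Python Console) or in a configured PyQGIS environment, so it is guarded here to keep the script from failing elsewhere.

```python
def summarize_layers(project):
    """Return sorted (layer name, CRS id) pairs for every layer in a project.

    `project` is expected to behave like QgsProject: mapLayers() returns a
    dict of layers, and each layer offers name() and crs().authid().
    """
    summary = []
    for layer in project.mapLayers().values():
        summary.append((layer.name(), layer.crs().authid()))
    return sorted(summary)


try:
    # Only importable inside QGIS or a configured PyQGIS environment.
    from qgis.core import QgsProject
except ImportError:
    QgsProject = None

if QgsProject is not None:
    # Print one line per loaded layer, e.g. its name and EPSG code.
    for name, crs_id in summarize_layers(QgsProject.instance()):
        print(f"{name}: {crs_id}")
```

Pasted into the QGIS Python Console with a project open, this should print one line per layer. Outside QGIS it simply defines the helper and does nothing else.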