Data Science for Non-Coders

Science in general is common to all. Anybody can learn it, practice it, share it, experiment with it, and question it. Its method of practice in any specific discipline often comes with a learning curve. For some sciences, the process of learning and unlearning depends more on how the discipline has evolved through history, and on whether it has been useful to society or merely used by it (objectification). Technology cannot be scientific without practicing scientific principles. By default, a technology is neither good nor bad, and not even neutral. Only the scientific process and social participation make a technology pro-people, democratic, and scientific, combating the anti-scientific forces that coexist in society.

Even when a field places no restriction on participation, its current honorary status builds an artificial barrier, to which the historically built knowledge itself contributes. Thus, over time, to make science more accessible, the participating community has to unlearn, dishonour unnecessary fame, and so on.

Data Science

Data science can be interpreted as a field that provides scientific methods and processes for practicing data operations. It has existed ever since humans started to quantify and organize knowledge based on the data, information, and facts they experienced, collected, and observed in their environment. Such practice has always been useful for organizing information and thereby forming proper knowledge, for understanding the historical labour that contributed to that knowledge, and for seeing how it can be useful to society as a whole.

In general, it can be viewed from two different perspectives:

  1. Explorative Data Science
  2. Explanatory Data Science
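The page does not define the two perspectives further. As a rough illustration only, assuming that "explorative" means describing data without a prior model and "explanatory" means fitting a model that accounts for a relationship, the difference could be sketched as below. The function names `explore` and `fit_line` and the sample numbers are hypothetical, and only the Python standard library is used.

```python
import statistics


def explore(values):
    """Exploratory step (assumed meaning): describe the data, no model."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def fit_line(x, y):
    """Explanatory step (assumed meaning): least-squares line y = slope * x + intercept."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    hours = [1, 2, 3, 4]
    output = [3, 5, 7, 9]
    print(explore(output))
    print(fit_line(hours, output))
```

The same two steps are available without writing code in several of the applications listed under Tools, which is the route this page recommends for non-coders.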

Data science is not limited to the scientific method of analyzing data; it spans a broad spectrum of practices:

  1. Data Expectation
  2. Field Survey
  3. Systems Identification
  4. Data Extraction
  5. Measurement & Instrumentation
  6. Data Acquisition
  7. Data Transmission & Communication
  8. Data Collection
  9. Data Analysis
  10. Statistical Pre/Post processing
  11. Visualization
  12. Systems Modeling
  13. Pattern/Behavior Extraction
  14. Inferencing
  15. Reporting & Communication
  16. Information & Knowledge formation
  17. Research
  18. Democratization & Socialization of Data & Methods
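For readers who do want to see how a few of these practices chain together in code, here is a minimal sketch covering data collection, statistical pre-processing, data analysis, and reporting. Everything in it is an assumption made for illustration: the survey data, the column names `village` and `household_size`, and the function names are hypothetical, and only the Python standard library is used.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical survey export; in practice this would come from a collection tool.
RAW_CSV = """village,household_size
Village A,4
Village A,5
Village A,
Village B,3
Village B,six
Village B,7
"""


def collect(text):
    """Data collection: read CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Pre-processing: keep only rows whose household_size is a whole number."""
    cleaned = []
    for row in rows:
        value = (row["household_size"] or "").strip()
        if value.isdigit():
            cleaned.append({"village": row["village"], "household_size": int(value)})
    return cleaned


def analyze(rows):
    """Data analysis: mean household size per village."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["village"]].append(row["household_size"])
    return {village: statistics.mean(sizes) for village, sizes in groups.items()}


def report(summary):
    """Reporting: one plain-text line per village, sorted by name."""
    return [f"{village}: mean household size {mean:.1f}"
            for village, mean in sorted(summary.items())]


if __name__ == "__main__":
    for line in report(analyze(clean(collect(RAW_CSV)))):
        print(line)
```

Keeping each practice in its own small function mirrors the list above: any single step can be replaced by a graphical tool without touching the others.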

Many of the above processes do not require coding. Even where coding is involved in using software tools, it comes as part of the process of making better tools and platforms, which we accept as a fundamental collaborative peer production process that results in a high-quality, useful tool set complementing the practitioner's skill. Data tools thus become a vital, central point for enabling participation with a small learning curve, while their transparency allows newcomers to choose how far along the learning curve to travel based on the time they have.

Tools

At every turn in the history of science and technology, skills, tools, and labour have decided the further course. The tools and skills in use are themselves products of historical labour. Thus the tools used today to practice data science dictate how, and what kind of, science one is going to practice. A tool that is developed collaboratively by fellow hackers, developers, researchers, statisticians, mathematicians, and others, and that is also distributed democratically, ensures transparency. It thereby forms a more available, accessible, and affordable ecosystem that reduces the pain of learning and unlearning new tools and skills, resulting in a course towards progress.

In that sense, there are a number of free software and open data supporting tools, listed below:

Data Collection, Survey Tools

No. | Name          | Type                   | Link          | License
1.  | Open Data Kit | Data Collection Suite  | ODK           | Apache 2.0
2.  | Open Rosa     | Data Collection        | Open Rosa ODK | NA
3.  | GeoODK        | Data Collection        | Geo ODK       | NA
4.  | Kobo Tool Box | Data Collection/Survey | Kobo Tool Box | Apache 2.0
5.  | Enketo        | Data Collection        | Enketo        | Apache 2.0

Statistical Analysis, Data Science Tools

No. | Name                   | Type                      | Link          | License
1.  | GNU Scientific Library | Library                   | GSL           | GPL
2.  | GNU PSPP               | Application               | PSPP          | GPLv3
3.  | Gretl                  | Application, Econometrics | gretl         | GPLv3
4.  | SciKitLearn            | Library                   | scikit-learn  | New BSD
5.  | Orange                 | Application, DM & ML      | Orange        | GPLv3
6.  | R                      | Application, Library      | R             | GPLv2
7.  | Jamovi                 | Application, Library      | Jamovi        | AGPLv3, GPL2+
8.  | Shogun                 | Application, Library      | ShogunToolbox | BSD 3 Clause
9.  | Stan                   | Library, Modeling         | Stan          | New BSD, GPLv3
10. | Pandas                 | Library                   | Pandas        | BSD
11. | Xarray                 | Library                   | xarray        | Apache
12. | SOFA                   | Application               | SOFA          | AGPLv3
13. | GNU Data Language      | Application, Library      | GDL           | GPLv2
14. | SciPy                  | Library                   | SciPy         | BSD
15. | Numpy                  | Library                   | NumPy         | BSD

To learn data science: https://datalab.cc/

Data Visualization

No. | Name     | Type                                    | Link     | License
1.  | Rawgraph | Vector Data Visualization based on D3   | Rawgraph | Apache 2.0
datascience.txt · Last modified: 2019/04/17 16:54 by Ganesh