
Data Science

Data science is the field of study that works with large volumes of data, using modern tools and methods to uncover hidden patterns, extract useful information, and support business decisions. It commonly builds predictive models with machine learning algorithms.

The data used for analysis can arrive in a variety of formats and come from a wide range of sources.

The Data Science Lifecycle

Data science’s lifecycle consists of five distinct stages, each with its own tasks:

  • Capture: Data acquisition, data entry, signal reception, and data extraction. This stage gathers raw data, both structured and unstructured.

  • Maintain: Data architecture, data staging, data cleaning, and data processing. This stage transforms the raw data into a usable form.

  • Process: Data modeling, data summarization, and clustering/classification. Data scientists examine the prepared data's patterns, ranges, and biases to judge how useful it will be for predictive analysis.

  • Analyze: Exploratory/confirmatory analysis, regression, text mining, predictive analysis, and qualitative analysis. This is the core of the lifecycle, where the various analyses of the data are carried out.

  • Communicate: Business intelligence, data visualization, data reporting, and decision-making. In this last stage, analysts present the analyses in forms that are easy to read, such as reports, charts, and graphs.
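
The five stages can be sketched end to end on a toy dataset. The following is a minimal illustration using only the Python standard library, not a prescribed workflow: the `ad_spend`/`sales` data, the function names, and the choice of a least-squares line for the Analyze stage are all assumptions made for this example.

```python
import csv
import io
import statistics

# Hypothetical raw input: one row has a missing value and one is malformed.
RAW_CSV = """ad_spend,sales
10,25
20,45
30,65
,70
forty,85
50,105
"""

def capture(text):
    """Capture: read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def maintain(rows):
    """Maintain: keep only rows whose fields convert cleanly to numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["ad_spend"]), float(row["sales"])))
        except ValueError:
            continue  # skip missing or malformed values
    return clean

def process(pairs):
    """Process: summarize size and ranges to judge whether the data is usable."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return {"n": len(pairs), "x_range": (min(xs), max(xs)), "y_mean": statistics.mean(ys)}

def analyze(pairs):
    """Analyze: fit sales = slope * ad_spend + intercept by ordinary least squares."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def communicate(summary, model):
    """Communicate: format the result as a one-line report."""
    slope, intercept = model
    return f"{summary['n']} clean rows; sales = {slope:.2f} * ad_spend + {intercept:.2f}"

rows = capture(RAW_CSV)
pairs = maintain(rows)
summary = process(pairs)
model = analyze(pairs)
print(communicate(summary, model))
```

In a real project each stage would be far larger (for example, a database or API in place of the inline CSV, and charts in place of a printed line), but the hand-off from one stage to the next follows the same shape.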

Prerequisites

  1. Machine Learning: Machine learning is the backbone of data science. Data scientists need a solid grasp of ML in addition to a basic knowledge of statistics.

  2. Modeling: Mathematical models let you make quick calculations and predictions based on what you already know about the data. Modeling is also part of machine learning: it involves identifying which algorithm is best suited to a given problem and how to train the resulting models.

  3. Statistics: Statistics is the foundation of data science. A firm grasp of statistics helps you extract more insight and produce more meaningful results.

  4. Programming: Some programming knowledge is necessary to carry out a data science project successfully. Python and R are the most popular languages. Python is particularly well-liked because it is easy to learn and offers a wide range of libraries for data science and machine learning.

  5. Databases: A competent data scientist must understand how databases work, how to manage them, and how to extract data from them.
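
As a small illustration of how the programming, database, and statistics prerequisites fit together, the sketch below pulls data out of a database with SQL and then describes it statistically. It uses only Python's built-in `sqlite3` and `statistics` modules. The `orders` table and its values are hypothetical and exist only for this example.

```python
import sqlite3
import statistics

# In-memory database with a hypothetical "orders" table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ana", 30.0), ("ben", 10.0), ("ben", 50.0), ("cy", 40.0)],
)

# Databases: extract only what is needed, aggregated per customer.
per_customer = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

# Statistics: describe the extracted values.
amounts = [row[0] for row in conn.execute("SELECT amount FROM orders")]
mean_amount = statistics.mean(amounts)
median_amount = statistics.median(amounts)
stdev_amount = statistics.stdev(amounts)  # sample standard deviation

conn.close()

print(per_customer)
print(mean_amount, median_amount, round(stdev_amount, 2))
```

Pushing the aggregation into the SQL query, rather than loading every row and summing in Python, is a common choice when the table is large, because the database can do the grouping before any data is transferred.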

Uses of Data Science

  1. With the help of data science, inferences and predictions can be drawn from seemingly unorganized or unrelated data.
  2. Tech companies that collect user data can apply data science techniques to turn that data into profitable or otherwise valuable information.
  3. In healthcare, data science applied to genetic and genomic research helps tailor treatments to individual patients.

Data Science Tools

  1. Data Analysis: SAS, Jupyter, RStudio, MATLAB, Excel
  2. Data Warehousing: AWS Redshift
  3. Data Visualization: Jupyter, Tableau
  4. Machine Learning: Azure ML Studio

Applications of Data Science

  • Speech Recognition
  • Image Recognition
  • Internet Search
  • Recommender Systems
  • Healthcare
  • Logistics
  • Gaming
  • Fraud Detection
  • Targeted Advertising
  • AR (Augmented Reality)

Conclusion

Data, once turned into actionable knowledge, can make the difference between a company's success and failure. By adopting data science tools, businesses can forecast future growth, identify potential problems, and plan accordingly.