Define all the types of basic math needed for data science.
Answer:
Step-by-step explanation:
Math is like an octopus: it has tentacles that can reach out and touch just about every subject. And while some subjects only get a light brush, others get wrapped up like a clam in the tentacles’ vice-like grip. Data science falls into the latter category. If you want to do data science, you’re going to have to deal with math. If you’ve completed a math degree or some other degree that provides an emphasis on quantitative skills, you’re probably wondering if everything you learned to get your degree was necessary. I know I did. And if you don’t have that background, you’re probably wondering: how much math is really needed to do data science?
In this post, we’re going to explore what it means to do data science and talk about just how much math you need to know to get started. Let’s start with what “data science” actually means. You could probably ask a dozen people and get a dozen different answers! Here at Dataquest, we define data science as the discipline of using data and advanced statistics to make predictions. It’s a professional discipline that’s focused on creating understanding from sometimes-messy and disparate data (although precisely what a data scientist is tackling will vary by employer). Statistics is the only mathematical discipline we mentioned in that definition, but data science also regularly involves other fields within math. Learning statistics is a great start, but data science also uses algorithms to make predictions. These algorithms are called machine learning algorithms, and there are literally hundreds of them. Covering how much math is needed for every type of algorithm in depth is not within the scope of this post, but I will discuss how much math you need to know for each of the following commonly used algorithms:
Naïve Bayes’ Classifiers
What they are: Naïve Bayes’ classifiers are a family of algorithms based on the common principle that the value of a specific feature is independent of the value of any other feature. They allow us to predict the probability of an event happening based on conditions we know about the events in question. The name comes from Bayes’ theorem, which can be written mathematically as follows:
P(A|B) = P(B|A) P(A) / P(B)

where A and B are events and P(B) is not equal to 0. That looks complicated, but we can break it down into pretty manageable pieces:

P(A|B) is a conditional probability. Specifically, the likelihood of event A occurring given that B is true.

P(B|A) is also a conditional probability. Specifically, the likelihood of event B occurring given that A is true.

P(A) and P(B) are the probabilities of observing A and B independently of each other.
Math we need: If you want to understand how Naive Bayes classifiers work, you need to understand the fundamentals of probability and conditional probability. To get an introduction to probability, you can check out our course on probability. You can also check out our course on conditional probability to get a thorough understanding of Bayes’ theorem, as well as how to code Naive Bayes from scratch.
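To make the theorem concrete, here is a minimal sketch (not from the original post) that plugs hypothetical numbers into Bayes’ theorem in Python; the spam-filter scenario and all of the probabilities are made up purely for illustration.

```python
# Minimal sketch (not from the original post): plugging hypothetical numbers
# into Bayes' theorem. The spam-filter scenario and probabilities are made up.

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B). Assumes P(B) is not zero."""
    return p_b_given_a * p_a / p_b

# A = "email is spam", B = "email contains the word 'free'"
p_spam = 0.20             # P(A): prior probability that any email is spam
p_free = 0.11             # P(B): probability that any email contains 'free'
p_free_given_spam = 0.50  # P(B|A): probability a spam email contains 'free'

# P(spam | 'free'): how likely an email is spam once we see the word 'free'
print(bayes(p_free_given_spam, p_spam, p_free))  # ~0.909
```

A full Naive Bayes classifier applies the same idea across many features at once, multiplying their conditional probabilities under the independence assumption described above.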
Linear Regression
What it is: Linear regression is the most basic type of regression. It allows us to understand the relationship between two continuous variables. In the case of simple linear regression, this means taking a set of data points and plotting a trend line that can be used to make predictions about the future. Linear regression is an example of parametric machine learning. In parametric machine learning, the training process ultimately produces a mathematical function that best approximates the patterns it found in the training set. That mathematical function can then be used to make predictions about expected future results. In machine learning, mathematical functions are referred to as models. In the case of linear regression, the model can be expressed as:
y = a_0 + a_1*x_1 + a_2*x_2 + ... + a_n*x_n

where a_0, a_1, ..., a_n represent the parameter values specific to the data set, x_1, x_2, ..., x_n represent the feature columns we choose to use in our model, and y represents the target column. The goal of linear regression is to find the optimal parameter values that best describe the relationship between the feature columns and the target column. In other words: to find the line of best fit for the data so that a trend line can be extrapolated to predict future results. To find the optimal parameters for a linear regression model, we want to minimize the model’s residual sum of squares. The residual is often referred to as the error, and it describes the difference between the predicted values and the true values. The formula for the residual sum of squares can be expressed as:
RSS = (y_1 - ŷ_1)^2 + (y_2 - ŷ_2)^2 + ... + (y_n - ŷ_n)^2

(where ŷ_i are the predicted values for the target column and y_i are the true values.)

Math we need: If you want to scrape the surface, a course in elementary statistics would be fine. If you want a deep conceptual understanding, you’ll probably want to know how the formula for the residual sum of squares is derived, which you can learn in most statistics courses.
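As a rough illustration of the idea, here is a minimal sketch (not from the original post) that fits a simple linear regression with NumPy and then computes the residual sum of squares; the data set is made up, and np.polyfit is just one of several ways to find the least-squares line.

```python
# Minimal sketch (not from the original post): fitting a simple linear
# regression by least squares with NumPy, then computing the residual
# sum of squares (RSS). The data set below is made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # feature column
y = np.array([2.1, 4.3, 6.2, 8.1, 9.8])   # target column

# polyfit with degree 1 returns the slope (a1) and intercept (a0)
# of the line that minimizes the residual sum of squares
a1, a0 = np.polyfit(x, y, 1)

y_hat = a0 + a1 * x              # predicted values from the model
rss = np.sum((y - y_hat) ** 2)   # residual sum of squares

print(f"intercept = {a0:.3f}, slope = {a1:.3f}, RSS = {rss:.4f}")
```

In practice you would usually reach for a library model such as scikit-learn’s LinearRegression, but the underlying goal is the same: pick the parameters that minimize the RSS. Hope it helps!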