LAB 4A: If the Line Fits…
Directions: Follow along with the slides, completing the questions in blue on your computer, and answering the questions in red in your journal.
How to make predictions
- Anyone can make predictions.
  – Data scientists use data to inform their predictions by using the information learned from the sample to make predictions for the whole population.
- In this lab, we'll learn how to make predictions by finding the line of best fit.
  – You will also learn how to use the information from one variable to make predictions about another variable.
 
Predicting heights
- (1) Write and run code using the `data()` function to load the `arm_span` data.
- This data comes from a sample of 90 people in the Los Angeles area.
  – The measurements of `height` and `armspan` are in inches.
  – A person's `armspan` is the maximum distance between their fingertips when they spread their arms out wide.
- (2) Write and run code making a plot of the `height` variable.
  – (3) If you had to predict the height of someone in the Los Angeles area, what single height would you choose and why?
  – (4) Would you describe this as a good guess? What might you try to improve your predictions?
 
Predicting heights knowing arm spans
- (5) Write and run code creating two subsets of our `arm_span` data:
  – One for `armspan >= 61` and `armspan <= 63`.
  – A second for `armspan >= 64` and `armspan <= 66`.
- (6) Write and run code creating a `histogram` for the `height` of people in each subset.
- Answer the following based on the data:
  – (7) What `height` would you predict if you knew a person had an `armspan` around 62 inches?
  – (8) What `height` would you predict if you knew a person had an `armspan` around 65 inches?
  – (9) Does knowing someone's `armspan` help you predict their height? Why or why not?

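The idea of splitting data by a range of one variable and then looking at another variable can be sketched in base R. This is only an illustration: the `toy` data frame below is made up for the example and is not the `arm_span` data, and `subset()` is a base R function that may differ from the approach you used in earlier labs.

```r
# Made-up values for illustration only; NOT the real arm_span data.
toy <- data.frame(
  armspan = c(60, 61, 62, 63, 64, 65, 66, 67),
  height  = c(61, 62, 62, 64, 64, 66, 65, 68)
)

# Keep only the rows whose armspan falls inside each range.
low  <- subset(toy, armspan >= 61 & armspan <= 63)
high <- subset(toy, armspan >= 64 & armspan <= 66)

# Compare the typical height in each subset.
mean(low$height)
mean(high$height)
```

If the typical height differs between the two subsets, that is a sign that one variable carries information about the other.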
Fitting lines
- Notice that there is a trend that people with a larger `armspan` also tend to have a larger mean `height`.
  – One way of describing this sort of trend is with a line.
- Data scientists often fit lines to their data to make predictions.
  – What we mean by fit is to come up with a line that's close to as many of the data points as possible.
- (10) Write and run code creating a scatterplot for `height` and `armspan`. Then run the following code.
    add_line()
- On the Plot pane, click two data points to draw a line through them.
- NOTE: Watch the following video if you are experiencing difficulties obtaining your line: https://youtu.be/pGqXHGhhwJ8
- If you are unsuccessful using the `add_line()` function, refer to the next slide to learn how to use the `get_line()` function.

get_line()
- The `get_line()` function does not rely on clicking on the scatterplot to choose points, but rather on you providing the points manually.
- For example, let's say you want to obtain the equation of the line that passes through the points (59,60) and (68,67). This is how you would use the `get_line()` function:
    get_line(c(59,60), c(68,67))
    ##  intercept      slope
    ## 14.1111111  0.7777778
- Notice the output is the y-intercept and the slope of your line.
- Now you can use the `add_line()` function to include the line in your scatterplot.
    add_line(intercept = 14.1111111, slope = 0.7777778)
- If your line doesn't quite fit the way you want it, try another ordered pair or make modifications to the existing equation.
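
To see where the intercept and slope for the points (59,60) and (68,67) come from, here is a minimal sketch in base R that works through the arithmetic by hand. It is not a description of how `get_line()` is written; the names `p1`, `p2`, `slope`, and `intercept` are chosen here just for the example.

```r
# Two points, each written as c(armspan, height).
p1 <- c(59, 60)
p2 <- c(68, 67)

# Slope is rise over run: change in height divided by change in armspan.
slope <- (p2[2] - p1[2]) / (p2[1] - p1[1])

# The line passes through p1, so solve height = intercept + slope * armspan.
intercept <- p1[2] - slope * p1[1]

# Round to 7 decimal places to compare with the values shown above.
round(c(intercept = intercept, slope = slope), 7)
```

The slope works out to 7/9 and the intercept to 127/9, which agree with the rounded values 0.7777778 and 14.1111111 in the example.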
 
Predicting with lines
- (11) Draw a line that you think is a good fit and write down its equation.
- (12) Using your equation: Predict how tall a person with a 62-inch `armspan` and a person with a 65-inch `armspan` would be.
- Using a line to make predictions also lets us make predictions for `armspan`s that aren't in our data.
  – (13) How tall would you predict a person with a 63.5-inch `armspan` to be?
  – (14) Compare your answers with a neighbor. Did both of you come up with the same equation for a line? If not, can you tell which line fits the data best?
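
Once you have an intercept and a slope, a prediction is just the equation of the line evaluated at a chosen `armspan`. The sketch below shows one way to do that in base R. The function `predict_height()` is a hypothetical helper written for this example and is not part of the lab's tools. Its default intercept and slope come from the earlier `get_line()` example, so replace them with the values from your own line.

```r
# Hypothetical helper: height predicted by the line
#   height = intercept + slope * armspan
# Defaults are the example line through (59,60) and (68,67); swap in your own.
predict_height <- function(armspan, intercept = 14.1111111, slope = 0.7777778) {
  intercept + slope * armspan
}

# Predict for several arm spans at once, including values not in the data.
predict_height(c(60, 70))
```

Because the function only needs an `armspan` value, it works the same way whether or not anyone in the sample had that exact arm span.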