Creating A Scatter Plot in Excel
Objectives
• Enter and format data in an Excel spreadsheet in a form appropriate for graphing
• Create a scatter plot from spreadsheet data
• Insert a linear regression line (trendline) into the scatter plot
• Use the slope/intercept formula for the regression line to calculate an x value for a known y
value
• Explore curve fitting to scatterplot data
• Create a connected point (line) graph
• Place a reference line in a graph
Introduction
Beer's Law states that there is a linear relationship between concentration of a colored compound
in solution and the light absorption of the solution. This fact can be used to calculate the
concentration of unknown solutions, given their absorption readings. First, a series of solutions of
known concentration are tested for their absorption level. Next, a scatter plot is made of this
empirical data and a linear regression line is fitted to the data. This regression line can be
expressed as a formula and used to calculate the concentration of unknown solutions.
Open Excel. On Unity/Eos computers, the program will be located on the Application Launcher.
On other computers, it will probably be located under the Start Menu.
• Click and drag over the range of cells that will hold the concentration data (A5 through
A10 for the sample data)
• Choose Format > Cells... (this is shorthand for choosing Cells... from the Format menu
at the top of the Excel window)
• Click on the Number tab
• Under Category choose Number and set Decimal places to 5
• Click OK
• Repeat for the absorbance data column (B5 through B10 for the sample data), setting the
decimal places to 4
The last step before creating the graph is to choose the data you want to graph.
• Highlight the data in both the concentration and absorbance columns (but not the
unknown data)
With the data you want graphed highlighted, start the Chart Wizard.
• Choose the Chart Wizard icon from the tool bar. If the Chart Wizard is not visible, you
can also choose Insert > Chart...
• Choose XY (Scatter) and the unconnected points icon for the Chart sub-type
The Data Range box should reflect the data you highlighted in the spreadsheet. The Series
option should be set to Columns, which is how your data is organized.
The next dialogue in the wizard is where you label your chart
Keep the chart as an object in the current sheet. Note: Your current sheet is probably named with
the default name of "Sheet1".
• Click Finish
The initial scatter plot is now finished and should appear on the same spreadsheet page as your
original data. A few items of note:
With your graph highlighted, you can click and drag the chart to wherever you would
like it located on the spreadsheet page. Grabbing one of the four corner handles allows
you to resize the graph. Note: the graph will automatically adjust a number of chart
properties as you resize the graph, including the font size of the text in the graph. You
may need to go back and alter these properties. At the end of the first part of this tutorial,
you will learn how to do this.
When the chart window is highlighted, besides having the chart floating palette appear, a Chart
menu also appears. From the Chart menu, you can add a regression line to the chart.
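For readers curious about what the linear trendline represents, it is an ordinary least-squares fit of a straight line to the plotted points. The following is a minimal Python sketch of that calculation, not part of the Excel procedure. The function name fit_line is hypothetical, and the concentration and absorbance values are made-up placeholders, not the tutorial's sample data.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of x/y cross-deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration data (placeholders, NOT the tutorial's values)
concentration = [0.0001, 0.0002, 0.0003, 0.0004, 0.0005]
absorbance = [0.21, 0.41, 0.61, 0.81, 1.01]

slope, intercept = fit_line(concentration, absorbance)
print(f"y = {slope:.1f}x + {intercept:.4f}")
```

The slope and intercept play the same role as the two numbers in the equation Excel displays on the chart.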
Using the linear equation, a spreadsheet cell can have an equation associated with it to do the
calculation for us. We have a value for y (Absorbance) and need to solve for x (Concentration).
Below are the algebraic equations working out this calculation:
y = 2071.9x + 0.0111
y - 0.0111 = 2071.9x
(y - 0.0111) / 2071.9 = x
Now we have to convert this final equation into an equation in a spreadsheet cell. The 'B12' in the
equation represents y (the absorbance of the unknown). The solution for x (Concentration) is then
displayed in cell 'C12'.
• Highlight a spreadsheet cell to hold 'x', the result of the final equation (cell C12, labeled
B).
• Click in the equation area (labeled C)
• Type an equal sign and then an opening parenthesis
• Click in the cell representing 'y' in your equation (cell B12) to put this cell label in your
equation
• Finish typing your equation
Note: If your equation differs from the one in this example, use your equation
• Highlight the original equation cell (C12) and the cell below it (C13)
• Choose Edit > Fill > Down
Note that if you highlight your new equation in C13, the reference to cell B12 has also
incremented to cell B13.
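The same calculation can be sketched outside the spreadsheet. The Python below mirrors the rearranged equation (y - 0.0111) / 2071.9 = x, and applying it to a list plays the role of the fill-down step. The slope and intercept come from the example regression equation in this tutorial; the function name and the absorbance readings are hypothetical placeholders, so substitute your own values.

```python
SLOPE = 2071.9      # slope from the example regression equation
INTERCEPT = 0.0111  # intercept from the example regression equation


def concentration_from_absorbance(y, slope=SLOPE, intercept=INTERCEPT):
    """Solve y = slope * x + intercept for x (the concentration)."""
    return (y - intercept) / slope


# Hypothetical absorbance readings for two unknowns (placeholders)
unknown_absorbances = [0.35, 0.58]

# One result per unknown, analogous to filling the formula down a column
concentrations = [concentration_from_absorbance(y) for y in unknown_absorbances]
for y, x in zip(unknown_absorbances, concentrations):
    print(f"absorbance {y:.4f} -> concentration {x:.5f}")
```

Passing slope and intercept as parameters makes it easy to reuse the function when your own regression equation differs from the example.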
The readability and display of the scatterplot can be further enhanced by modifying a number of
the parameters and options for the chart. Many of these modifications can be accessed through
the Chart menu, the Chart floating palette, and by double-clicking the element on the chart itself.
Let's start by creating a better contrast between the data points and regression line and the
background.
• Double-click in the gray background area of the chart (or select Chart Area in the
Chart floating palette and then click the Format icon)
In the Chart Area Format dialogue, set the border and background colors
• Click on the horizontal grid lines in the chart and press the Delete key
Now, adjust the color and line weight of the regression line and the color of the data points
• Double-click on the regression line (or choose Series 1 Trendline 1 from the Chart
floating palette and then click the Format icon)
• Choose a thinner line for the Line Weight
• Click on the word Automatic next to Line Color and the color palette appears. Choose
dark blue from the color palette
• Click OK
• Double-click on one of the data points (or choose Series 1 and click the Format icon)
• Choose dark red from the color palette for the Marker Foreground and Background
• Click OK
Finally, you can move the regression equation to a more central location on the chart
If necessary, resize the font size for text elements in the graph.
• Either double click the text element or choose it from the floating palette
• Click on the Fonts tab
• Choose a different font size
This is the end of the first half of the scatter plot tutorial.
In this next part of the tutorial, we will work with another set of data. In this case, it is of a strong
acid-strong base titration. With this titration, a strong base (NaOH) of known concentration is
added to a strong acid (also of known concentration, in this case). As the strong base is added to
solution, its OH- ions bind with the free H+ ions of the acid. An equivalence point is reached when
there are neither free OH- nor free H+ ions in the solution. This equivalence point can be found with a
color indicator in the solution or through a pH titration curve. This part of the tutorial will show you
how to do the latter.
• Using a new page in the spreadsheet, enter your titration data. If you do not have your
own data, use the data shown in Figure 1.
• Return to the beginning of the tutorial if you need hints on formatting the cells to the
proper number of decimal places
Figure 1.
Now, create a scatter plot of titration data, just as you did with the Beer's Law plot.
• The defaults for step 2 should be fine if you properly highlighted the data
• In step 3 enter the chart title and x and y axis labels and turn off the legend
• In step 4, leave as an object in the current page
The next logical question that you might ask is whether a linear regression line or a curved
regression line might help us interpret the titration data. You may remember that our goal with this
plot is to calculate the equivalence point, that is, what amount of NaOH is needed to change the
pH of the mixture to 7 (neutral)?
Looking at the data, it is clear that the first 45 mL of NaOH do little to alter the pH of the mixture.
Then, between 45 mL and 55 mL, there is a sharp rise in pH before it levels off again. The data
trend is clearly not linear, and a linear regression line fits the data poorly.
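One way to put a number on how poorly a straight line fits is the coefficient of determination (R squared), where values near 1 indicate a close fit. The sketch below is a minimal Python illustration under stated assumptions: the titration readings are invented placeholders shaped like a strong acid-strong base curve (they are not the Figure 1 data), and the function name r_squared is hypothetical.

```python
def r_squared(xs, ys):
    """R squared of the least-squares straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a simple linear fit, R squared is the squared correlation coefficient
    return (sxy * sxy) / (sxx * syy)


# Hypothetical titration readings (placeholders, NOT the Figure 1 data)
ml_naoh = [0, 10, 20, 30, 40, 45, 49, 50, 51, 55, 60]
ph = [1.0, 1.2, 1.4, 1.6, 2.0, 2.3, 3.0, 7.0, 11.0, 11.7, 12.0]

r2 = r_squared(ml_naoh, ph)
print(f"R squared for a straight-line fit: {r2:.2f}")
```

A value well below 1 for S-shaped data like this is consistent with what the plot shows: the line misses both the flat regions and the steep rise.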
• Click on the linear regression line in the plot and press the delete key to delete the line
• Choose Chart > Add Trendline...
• Pick Polynomial subtype
• Set the Order of the curve to 2
You can see that a second order polynomial curve does not capture the steep rise of the data
well. A higher order curve might be tried:
Still, the third order polynomial does not capture the steep part of the curve where it passes
through a pH of 7. Even higher order curves could be created to see if they fit the data better.
Instead, a different approach will be taken for this data. Go ahead and delete the regression
curve:
• Click on the curved regression line in the plot and press the delete key
Instead of adding a curved regression line, all of the points of the titration data are connected with
a smooth curve. With this approach, the curve is guaranteed to go through all of the data points.
This is both good and bad. This option can be used if you have only one pH reading per
amount of NaOH added. If you have multiple pH readings for each amount added on the scatter
plot, you will not end up with a smooth curve. To change the scatter plot to a (smoothed) line
graph:
This smooth, connected curve helps locate where the steep part of the curve passes through pH
7.
The chart can be enhanced by adding a reference line at pH 7. This clearly marks the point where
the curve passes through this pH.
• A set of drawing tools should be visible at the bottom of the window. If not, click on the
Draw icon two to the right of the Chart wizard icon.
• Make sure your chart is highlighted
• Choose the line tool at the bottom of the window
• Draw a horizontal line at pH 7 across the width of the chart by clicking and dragging a
line across the chart area.
• With the horizontal line still highlighted, choose a 3/4 pt line thickness and a dashed
line type at the bottom of the window
As with the Beer's Law chart, further refinements can be made to the chart:
The above chart gives a good overview of the entire titration. If you would like to focus exclusively
on the steep part of the curve between 45 and 55 mL of added NaOH, a new chart can be created
which limits the X Axis range. Start by making a copy of the current chart:
Next, both vertical and horizontal gridlines can be added to more accurately locate the
equivalence point:
Even with this smooth curve passing through all of the data points, it is still an estimation of what
intermediate mL added/pH data points would be. A clear inaccuracy is where the curve moves in
a negative X direction between the 50 and 51 mL data points. More data points collected between
49 and 51 mL would both better smooth the curve and give a more accurate estimation of the
equivalence point.
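As a rough cross-check on a value read from the chart, the pH 7 crossing can also be estimated numerically. The Python sketch below uses straight-line interpolation between the two readings that bracket pH 7. This is a simpler technique than Excel's smoothed curve, so it will not reproduce the curve exactly. The function name volume_at_ph and the readings are hypothetical placeholders, not the tutorial's data.

```python
def volume_at_ph(volumes, phs, target=7.0):
    """Linearly interpolate the volume at which pH first reaches `target`.

    Returns None if no pair of consecutive readings brackets the target.
    """
    points = list(zip(volumes, phs))
    for (v0, p0), (v1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1 and p1 != p0:
            # Fraction of the way from the lower reading to the upper one
            fraction = (target - p0) / (p1 - p0)
            return v0 + fraction * (v1 - v0)
    return None


# Hypothetical readings around the steep region (placeholders only)
ml_naoh = [45, 49, 50, 51, 55]
ph = [2.3, 3.0, 5.0, 11.0, 11.7]

estimate = volume_at_ph(ml_naoh, ph)
print(f"Estimated volume at pH 7: {estimate:.2f} mL")
```

As with the chart, the estimate is only as good as the spacing of the readings, so extra data points between 49 and 51 mL would tighten it.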