Excel Linear Regression: Complete Guide to LINEST, SLOPE, INTERCEPT, and the Regression Trendline

Master excel linear regression with LINEST, SLOPE, INTERCEPT, Data Analysis ToolPak, and trendlines. Step-by-step examples, formulas, and interpretation.

Microsoft ExcelBy Katherine LeeMay 21, 202620 min read
Excel Linear Regression: Complete Guide to LINEST, SLOPE, INTERCEPT, and the Regression Trendline

Excel linear regression is one of the most practical statistical techniques you can run inside a spreadsheet, and it answers a deceptively simple question: how does one variable change when another variable changes? Whether you are forecasting sales from advertising spend, modeling exam scores against study hours, or projecting energy use based on temperature, Excel gives you at least five different ways to fit a straight line through your data. Each approach produces the same underlying slope and intercept, but the path you choose changes how much detail you get back about model quality, residuals, and confidence.

The classic equation behind every linear regression in Excel is y = mx + b, where m is the slope, b is the y-intercept, x is your independent variable, and y is the value you are trying to predict. Excel computes m and b using ordinary least squares, which minimizes the sum of squared vertical distances between each observed data point and the fitted line. This is the same math taught in introductory statistics courses, just wrapped inside fast worksheet functions like SLOPE, INTERCEPT, LINEST, FORECAST.LINEAR, and TREND that return results in milliseconds.

Beginners usually start with a scatter chart and a trendline because it is visual and requires no formulas. Intermediate users move to SLOPE and INTERCEPT for quick numeric output, and analysts who need R-squared, standard errors, and F-statistics turn to LINEST as an array formula or to the Data Analysis ToolPak. Each tool has tradeoffs in transparency, recalculation behavior, and how the output looks on the page, and learning when to use which one is what separates a casual spreadsheet user from a confident data modeler.

This guide walks through every method in order of complexity. You will see exactly which cells to click, which formulas to type, and how to read the numbers that come back. We cover single-variable regression first because it is easier to visualize, then expand into multiple regression where LINEST really earns its keep. Along the way we highlight common mistakes, such as forgetting to enter LINEST as an array in older Excel versions or misinterpreting a high R-squared value as proof of causation.

You do not need to install any add-ins to follow the worksheet-function methods, and most examples use sample data sets of fewer than fifty rows so you can recreate every screenshot in under five minutes. If you want fuller diagnostic output with residual plots and normal probability charts, we walk through enabling the Data Analysis ToolPak, which ships free with every desktop version of Excel and unlocks a guided regression dialog. Pair that with a standard deviation formula in Excel to fully describe the spread of your residuals.

By the end you will be able to fit a line, read the coefficients, compute predicted values, evaluate fit quality through R-squared and standard error, and present results in a chart that is ready to drop into a report. We also touch on extensions like polynomial trendlines and logarithmic transformations for cases where a straight line is not the right shape, plus the FORECAST.ETS function for time-series projections.

Linear regression is a foundational skill for analysts, students, scientists, and operations managers, and Excel happens to be the fastest environment in which to learn it. Once the workflow clicks, you will reach for it constantly to test hunches, set baselines, and communicate trends to colleagues who would never open a Python notebook. Let us start with the numbers behind the method.

Excel Linear Regression by the Numbers

📊5Built-in MethodsSLOPE, INTERCEPT, LINEST, TREND, FORECAST.LINEAR
🎯0–1R-Squared RangeCloser to 1 means a better fit
💻64Max Predictor VariablesLINEST supports up to 64 columns of x
⏱️<1 secCalculation TimeFor datasets under 100,000 rows
📈6Trendline TypesLinear, log, polynomial, power, exponential, moving avg
Microsoft Excel - Microsoft Excel certification study resource

Regression Methods at a Glance

📈Chart Trendline

Right-click any data series on a scatter chart and add a linear trendline. Display equation and R-squared on the chart for instant visual results without any formulas, perfect for presentations and quick exploration.

🧮SLOPE and INTERCEPT

Two simple functions that return the regression line's slope and y-intercept as single numbers. Best when you only need m and b for a manual y = mx + b prediction and want lightweight, transparent formulas.

📊LINEST Array Formula

Returns slope, intercept, R-squared, standard errors, F-statistic, and residual sums in a single 5-row block. The professional choice when you need full regression diagnostics inside the worksheet itself.

🎯FORECAST.LINEAR & TREND

Predict y-values for new x-inputs using the same least-squares math. FORECAST handles single predictions; TREND fills an entire range of fitted or projected values along an existing or new x-axis range.

🛠️Data Analysis ToolPak

Built-in add-in that opens a guided dialog producing ANOVA tables, coefficient confidence intervals, residual output, and normal probability plots. The closest Excel gets to dedicated statistical software like SPSS or R.

The fastest way to compute a regression line in Excel is the SLOPE and INTERCEPT pair. SLOPE(known_ys, known_xs) returns m, and INTERCEPT(known_ys, known_xs) returns b. Notice that the y-values come first in both functions, which trips up many first-time users who instinctively type x before y. Once you have m and b, you predict any new value with =m*new_x + b, or you can wrap it inside FORECAST.LINEAR(new_x, known_ys, known_xs) for a one-step calculation that hides the intermediate slope and intercept.

For richer output the LINEST function is the workhorse. Its syntax is LINEST(known_ys, known_xs, [const], [stats]). When you set the fourth argument to TRUE and enter the formula as an array across a 5-row by 2-column range, Excel returns slope and intercept in the first row, their standard errors in the second, R-squared and the standard error of the y-estimate in the third, the F-statistic and degrees of freedom in the fourth, and the regression and residual sums of squares in the fifth.

In Microsoft 365 and Excel 2021, LINEST spills automatically into adjacent cells, so you simply press Enter after typing the formula. In Excel 2019 and earlier, you must first select the full 5-by-2 range, type the formula, and confirm with Ctrl+Shift+Enter. Skipping that combination is the single most common reason users see only the slope value and conclude that LINEST is broken. The output is identical either way once entered correctly, and reading the third-row R-squared value is usually your first stop.

To make the LINEST output readable, label each cell. Put text headers like Slope, Intercept, SE Slope, SE Intercept, R-Squared, SE y-estimate, F-stat, df, SS reg, and SS res in an adjacent column or row. This turns a cryptic 10-number block into a self-documenting mini report you can copy into a dashboard or hand to a stakeholder without further explanation. Combine it with conditional formatting to flag low R-squared values in red and you have an at-a-glance fit indicator.

Many analysts also build helper columns for predicted y, residuals, and squared residuals. Predicted y is simply =slope*x + intercept, residual is =actual_y - predicted_y, and squared residual is the residual raised to the second power. Summing the squared residuals should match the SS res value LINEST returns, which is a useful sanity check that proves your manual model and Excel's internal calculation agree. These helper columns also feed directly into residual plots, which reveal whether the linear assumption holds.

If you prefer to skip formulas, the Insert Chart route is even faster. Create a scatter plot from two columns, right-click any data marker, choose Add Trendline, pick Linear, and tick the boxes for Display Equation on chart and Display R-squared value on chart.

The equation appears in y = mx + b form with full precision, and the R-squared number floats next to it. You can do this in under thirty seconds, which is why it is the most common technique in client-facing presentations. Combine this with techniques from our how to add a filter in Excel guide to slice data first.

For longer datasets, double-check that your x and y ranges are exactly the same length and contain no blank rows or text values. Excel silently ignores text in numeric ranges, which can shorten the effective sample size and skew results without warning. Use COUNT, COUNTA, and ISNUMBER checks before regressing, and if you see any #N/A or #DIV/0! errors in your LINEST output it almost always means there is a non-numeric cell hiding somewhere in your input range that needs to be cleaned up.

FREE Excel Basic and Advance Questions and Answers

Test your knowledge of Excel formulas, functions, and analysis tools with timed practice questions and instant feedback.

FREE Excel Formulas Questions and Answers

Practice LINEST, SLOPE, INTERCEPT, and other statistical formulas with realistic spreadsheet scenarios.

Three Ways to Run Excel Linear Regression

The trendline method is the most visual and beginner friendly approach in Excel. Start by selecting your two columns of data and inserting a Scatter chart from the Insert ribbon. Once the chart appears, right-click any data point, choose Add Trendline, and select Linear from the options panel that opens on the right. Excel instantly draws the best-fit line through your data using the same ordinary least squares math under the hood.

At the bottom of the trendline panel, tick Display Equation on chart and Display R-squared value on chart. The equation appears in the familiar y = mx + b format and R-squared sits next to it. This method is perfect for presentations because the audience sees both the visual fit and the math at once. The drawback is that you cannot easily extract individual coefficients into other cells for downstream calculations or sensitivity analysis.

Excellence Playa Mujeres - Microsoft Excel certification study resource

Excel Linear Regression: Pros and Cons

Pros
  • +Built into every version of Excel with no add-on cost or installation
  • +Multiple methods available from beginner trendlines to professional LINEST arrays
  • +Updates automatically when source data changes, unlike static statistical reports
  • +Easy to share results with colleagues who already have Excel installed
  • +Visual scatter plots and trendlines communicate findings to non-technical audiences
  • +Data Analysis ToolPak provides ANOVA tables and confidence intervals for free
  • +Handles up to 64 predictor variables in multiple regression scenarios
Cons
  • Limited to ordinary least squares, no robust regression or ridge methods built in
  • No automatic detection of outliers or influential points without manual inspection
  • Residual plots in ToolPak are basic and lack interactivity of R or Python output
  • Performance degrades noticeably above a few hundred thousand data points
  • No built-in cross-validation or train-test split workflow for model evaluation
  • Cannot handle categorical variables without manual dummy variable encoding

FREE Excel Functions Questions and Answers

Sharpen your knowledge of statistical, lookup, and math functions including SLOPE, INTERCEPT, and LINEST.

FREE Excel MCQ Questions and Answers

Multiple-choice questions covering Excel basics through advanced regression and data analysis features.

Pre-Analysis Data Checklist for Excel Linear Regression

  • Confirm both x and y columns contain only numeric values with no hidden text or blanks
  • Visualize the data with a scatter plot first to verify a linear pattern is plausible
  • Check that x and y ranges are exactly the same length to avoid silent truncation
  • Remove or investigate obvious outliers that could distort the slope and intercept
  • Make sure observations are independent and not duplicated from the same source
  • Verify that variance of y stays roughly constant across the range of x values
  • Sort by x ascending so residual patterns are easier to spot in helper columns
  • Decide whether to force the intercept through zero based on physical meaning
  • Calculate descriptive stats like mean, median, and standard deviation for context
  • Save a clean copy of raw data on a separate sheet before any analysis transformations

A high R-squared does not prove causation

R-squared measures how much variance in y is explained by your regression line, but a value of 0.95 does not prove x causes y. Always pair statistical fit with domain knowledge, check residual plots for hidden patterns, and confirm the relationship makes physical or business sense before drawing conclusions.

Interpreting the output of Excel linear regression begins with R-squared, which ranges from zero to one and tells you the proportion of variance in y explained by your model. An R-squared of 0.85 means 85 percent of the variation in your dependent variable is accounted for by changes in the independent variable, leaving 15 percent unexplained. Higher is generally better, but acceptable thresholds vary by field. Physics experiments often demand 0.99 or above, marketing analytics may accept 0.40, and behavioral science is happy with 0.20 because human behavior is inherently noisier than mechanical systems.

Standard error of the estimate, returned in row three column two of LINEST, measures the average distance between observed y-values and the regression line in the same units as y. If you are predicting sales in dollars, a standard error of 1,200 means typical predictions miss by about $1,200. This is often more useful than R-squared because it is in real-world units rather than an abstract proportion, and it gives stakeholders an immediate sense of model precision in language they already understand without statistical training.

The F-statistic in row four tests whether the overall regression is significant. Compare it against a critical F-value or compute the p-value with =F.DIST.RT(F_stat, df_reg, df_res). A p-value below 0.05 means the model explains significantly more variance than a flat line through the mean of y. For simple regression with one predictor, this F-test gives the same result as testing whether the slope coefficient is significantly different from zero, which you can also verify by dividing slope by its standard error to get a t-statistic.

Coefficient p-values are not returned directly by LINEST but are easy to compute. Take slope divided by its standard error to get t, then use =T.DIST.2T(ABS(t), df_res) to get the two-tailed p-value. A p-value under 0.05 means the slope is statistically distinct from zero at the 95 percent confidence level. The Data Analysis ToolPak does this calculation automatically and prints it in the coefficient table alongside lower and upper 95 percent confidence intervals, which is one reason the ToolPak is worth enabling for formal reports.

Residual analysis is the next layer of model validation that many spreadsheet users skip. Calculate residuals as actual y minus predicted y, then plot residuals against x or against predicted y. A good linear model produces residuals that scatter randomly around zero with constant variance. Patterns like a curve, a fan shape, or systematic over-prediction at certain x-values indicate the linear assumption is violated and you should consider transforming variables, adding polynomial terms, or moving to a non-linear model entirely.

Confidence intervals around predictions are tighter than prediction intervals around individual observations, and both widen as x moves further from the mean of your training data. This is why extrapolating far outside your observed x-range is risky. A regression fit on study hours between 1 and 10 should not be used to predict exam scores for someone studying 40 hours because you have no data to validate the line still applies. Always note the valid x-range alongside any prediction you publish to stakeholders.

Finally, watch out for multicollinearity in multiple regression. When two predictors are highly correlated with each other, their individual coefficients become unstable and their standard errors balloon, even though the overall model R-squared looks fine. Compute correlations between predictors before regressing and drop or combine variables with correlations above 0.8. Excel's CORREL function or a full correlation matrix from the ToolPak makes this check quick and prevents one of the most common interpretation mistakes in business regression analysis.

Excel Spreadsheet - Microsoft Excel certification study resource

Multiple linear regression in Excel extends the same LINEST function to handle several predictor variables at once. The syntax becomes LINEST(known_ys, known_xs_range, TRUE, TRUE), where known_xs_range is a multi-column block containing each independent variable in its own column. Excel returns slopes in reverse order from right to left across the first row, followed by the intercept in the last position. This reversed ordering catches almost every newcomer off guard, so always label your output cells explicitly with the variable name each coefficient corresponds to.

For example, if you regress sales against advertising spend in column A, store visits in column B, and email opens in column C, the LINEST formula =LINEST(D2:D100, A2:C100, TRUE, TRUE) returns coefficients in the order: email-opens slope, store-visits slope, advertising-spend slope, intercept. Reading this output without labels invites disaster. A common practice is to place the column header names above the corresponding output cells in reverse order so the entire regression is self-documenting and copy-paste safe across worksheets and across team members.

The TREND function complements LINEST by producing fitted or forecast values directly. =TREND(known_ys, known_xs, new_xs) returns predicted y-values for any list of new x inputs you supply, using the same multiple regression coefficients calculated internally. This is much more efficient than computing predictions one cell at a time with manual slope and intercept references. TREND works for both simple and multiple regression depending on whether your known_xs is one column or many, and it spills automatically in modern Excel versions for clean output.

FORECAST.LINEAR is the single-variable cousin of TREND and is best suited for simple regression with one predictor. For time-series forecasting where seasonality matters, switch to FORECAST.ETS, which uses exponential smoothing rather than linear regression and accounts for repeating cycles like monthly sales patterns or daily web traffic rhythms. Excel also offers Forecast Sheet on the Data ribbon, which builds a full chart with confidence bands in two clicks, though it is not appropriate for non-time-series regression problems.

When relationships are not linear, transform your variables before regressing. Take the natural log of y with LN() to handle exponential growth, square x to fit a quadratic pattern, or use 1/x for inverse relationships. Run LINEST on the transformed data and back-transform predictions afterward.

You can also add a polynomial trendline to a chart and display its equation, which gives you the same effect visually without writing transformation formulas. Compare R-squared values across model forms to pick the best fit, but always verify with residual plots that the chosen form is appropriate. For sorting your residuals or filtering by error magnitude, see our guide to how to create a filter in Excel.

Dummy variables let you include categorical predictors in regression. Create a new column for each category minus one and code each row as 1 if the observation belongs to that category or 0 otherwise. For example, to model salary by region across four regions, add three dummy columns leaving one region as the baseline. The coefficients then represent the difference in y between each non-baseline region and the baseline, holding other predictors constant. This is the same technique used in professional statistics packages and works seamlessly with LINEST.

Finally, document your regression workflow inside the workbook so future readers, including your future self, can reproduce the analysis. Add a methods sheet that lists data sources, transformation steps, predictor definitions, and model performance metrics. Include the final equation in plain English. A well-documented regression model is far more valuable than a black-box result, and a few minutes of documentation pays dividends every time someone questions the analysis or needs to update it with new data months down the line.

Practical tips for getting clean, defensible regression results in Excel start with data preparation. Spend more time cleaning than analyzing. Remove duplicate rows, standardize date formats, convert text-stored numbers to true numerics, and verify units are consistent across observations. A regression of revenue measured in thousands against advertising measured in millions will silently produce a slope that is off by three orders of magnitude, and Excel will never warn you. Use COUNT, COUNTA, MIN, MAX, and AVERAGE on every input column to sanity check ranges before regressing.

Save raw data on a protected sheet and do all transformations on copies. This makes it easy to roll back changes if you discover a mistake and lets reviewers verify your work from the original source. Use Excel Tables (Ctrl+T) for your input data because tables automatically expand formula references when new rows are added, which is a huge time saver when you receive monthly data updates and want the regression model to refresh without manually adjusting every range reference scattered across the workbook.

Use named ranges for your x and y inputs to make formulas more readable. Instead of =LINEST(D2:D500, A2:C500, TRUE, TRUE), define names Sales, AdSpend, StoreVisits, and EmailOpens, then write =LINEST(Sales, CHOOSE({1,2,3}, AdSpend, StoreVisits, EmailOpens), TRUE, TRUE) or simply pre-arrange the columns and reference =LINEST(Sales, Predictors, TRUE, TRUE). Named ranges turn cryptic cell references into self-documenting code that other analysts can read without opening the source workbook to figure out which column is which.

Build a regression dashboard that pulls slope, intercept, R-squared, standard error, and key p-values into a single summary block at the top of your sheet. Include a small line-fit chart, a residual plot, and a prediction calculator where users can enter new x-values and instantly see predicted y-values with confidence bands. This turns a one-time analysis into a living tool that stakeholders can interact with, which dramatically increases the perceived value of your work and ensures the model gets used rather than archived.

Validate your model against held-out data before publishing. Take 20 percent of your observations and set them aside before fitting. Run LINEST on the remaining 80 percent, then use TREND or FORECAST.LINEAR to predict y for the held-out x-values and compare predictions against actuals. Calculate root mean squared error as SQRT(SUMSQ(actual - predicted)/n) and compare it to the standard error from the training fit. If they are similar, your model generalizes; if test error is much higher, you have overfit and need a simpler model.

Document assumptions in plain language alongside the numbers. State explicitly that the regression assumes a linear relationship, independent observations, normally distributed residuals, and constant variance. Note the x-range over which the model is valid and any data quality caveats. Stakeholders rarely read footnotes, but they reference them when challenged, and having assumptions written down protects you when someone tries to use the model in conditions it was never validated for, like predicting next year's results from last quarter's data.

Finally, keep learning. Excel regression is a gateway to deeper statistical modeling, and once you are comfortable with LINEST and the ToolPak, the next steps are logistic regression for binary outcomes, time-series ARIMA models for forecasting, and eventually moving to R or Python for machine learning. The mental model you build by working through least squares manually in Excel transfers directly to every other regression tool you will use later, which makes the time invested in mastering Excel regression genuinely cumulative rather than throwaway.

FREE Excel Questions and Answers

Comprehensive Excel certification practice covering formulas, functions, pivot tables, charts, and statistical analysis.

FREE Excel Trivia Questions and Answers

Fun trivia-style questions to test your Excel knowledge across versions, shortcuts, features, and history.

Excel Questions and Answers

About the Author

Katherine LeeMBA, CPA, PHR, PMP

Business Consultant & Professional Certification Advisor

Wharton School, University of Pennsylvania

Katherine Lee earned her MBA from the Wharton School at the University of Pennsylvania and holds CPA, PHR, and PMP certifications. With a background spanning corporate finance, human resources, and project management, she has coached professionals preparing for CPA, CMA, PHR/SPHR, PMP, and financial services licensing exams.