Correlation is the relationship or association between two variables. There are multiple ways to measure correlation, but the most common is Pearson's correlation coefficient (r), which tells you the strength of the linear relationship between two variables. The value of r has a range of -1 to 1 (0 indicates no relationship). Values of r closer to -1 or 1 indicate a stronger relationship and values closer to 0 indicate a weaker relationship. Because Pearson's coefficient only picks up on linear relationships, and there are many other ways for variables to be associated, it's always best to plot your variables on a scatter plot, so that you can visually inspect them for other types of correlation.
It's important to remember that correlation does not always indicate causation. Two variables can be correlated without either variable causing the other. For instance, ice cream sales and drownings might be correlated, but that doesn't mean that ice cream causes drownings—instead, both ice cream sales and drownings increase when the weather is hot. Relationships like this are called spurious correlations.
Regression is a statistical method for estimating the relationship between two or more variables. In theory, regression can be used to predict the value of one variable (the dependent variable) from the value of one or more other variables (the independent variable/s or predictor/s). There are many different types of regression, depending on the number of variables and the properties of the data that one is working with, and each makes assumptions about the relationship between the variables. (For instance, most types of regression assume that the variables have a linear relationship.) Therefore, it is important to understand the assumptions underlying the type of regression that you use and how to properly interpret its results. Because regression will always output a relationship, whether or not the variables are truly causally associated, it is also important to carefully select your predictor variables.
Simple linear regression estimates a linear relationship between one dependent variable and one independent variable.
Multiple linear regression estimates a linear relationship between one dependent variable and two or more independent variables.
If you do a subject search for Regression Analysis you'll see that the library has over 200 books about regression. Select books are listed below. Also, note that econometrics texts will often include regression analysis and other related methods.