Hint for physicists
 Very often, the data to be fitted is a histogram of physical events. In that
 case, since the content of each bin follows a multinomial distribution, the
 error (standard deviation) on each bin is equal to √f, where f is the
 expression you are trying to fit.
 Of course, since you don't know the parameter values yet, you don't actually
 know f, so you approximate it by using the y data values. In the limit of
 large counts per bin, the two give the same result. In the case of a large
 number of bins, the
 variance can be approximated by y (standard deviation √y). Hence, the correct
 weighting factor that will give properly normalized errors is w = 1/y,
 and the corresponding one-standard-deviation error is σ = E1.
 The scaled error is E2 = E1*sqrt(χ²/n), where
 E1 is the standard error and n is the number of degrees of freedom, usually
 equal to the number of data points minus the number of parameters
 (N-M). When the weights are correct, χ²/n is expected to be close to 1, so
 E2 is close to E1.
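
 The recipe above can be sketched in a few lines of Python. This is only an
 illustration under stated assumptions, not the program's own fitting code:
 the model is a straight line f(x) = a + b*x (so M = 2), every bin is assumed
 to have y > 0, and the function name weighted_line_fit and the bin counts
 are made up for the example.

```python
import math

def weighted_line_fit(x, y):
    """Fit f(x) = a + b*x to histogram counts y using weights w = 1/y.

    Returns (a, b), E1, E2, chi2, n. Hypothetical helper for illustration.
    """
    # Assumption: all bins are non-empty, otherwise 1/y is undefined.
    w = [1.0 / yi for yi in y]

    # Weighted sums entering the normal equations of the linear fit.
    s = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s * sxx - sx * sx

    a = (sxx * sy - sx * sxy) / det
    b = (s * sxy - sx * sy) / det

    # E1: one-standard-deviation errors from the covariance matrix diagonal.
    e1 = (math.sqrt(sxx / det), math.sqrt(s / det))

    # chi-squared with the same weights, and n = N - M degrees of freedom.
    chi2 = sum(wi * (yi - (a + b * xi)) ** 2 for wi, xi, yi in zip(w, x, y))
    n = len(x) - 2

    # E2 = E1 * sqrt(chi2 / n)
    scale = math.sqrt(chi2 / n)
    e2 = (e1[0] * scale, e1[1] * scale)
    return (a, b), e1, e2, chi2, n

# Made-up bin centres and counts, for illustration only.
centres = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
counts = [12, 19, 31, 38, 52, 61]
params, E1, E2, chi2, n = weighted_line_fit(centres, counts)
print("a, b =", params)
print("E1 =", E1)
print("E2 =", E2)
print("chi2/n =", chi2 / n)
```

 If the printed chi2/n is far from 1, E1 and E2 differ noticeably, which
 suggests that either the model or the assumed √y errors do not describe
 the data well.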