// Apologies, since I do not know how to type c to the power of 4) //
Simply work in units where c = 1 and all your problems will be solved :)
One point that often gets overlooked -- and curse the days that E = mc² became the standard, because it tends to bake this view in -- is that the more physical formulation isn't E² = p² + m², but rather E² - p² = m². This innocent rearrangement is in fact crucial, for reasons that I hope this post will get to. Anyway, allow me to try a different, and probably still circular, tack. The short version is: it's just much more useful to redefine momentum away from mass x velocity than it is to not do that.
The first point is that physics is generally the study of things that we can actually measure. Moreover, because the world is complicated, the drive is always to find things that we can easily predict as well as measure. And the easiest thing to predict, and then compare to measurements, is anything that stays constant through the whole mess of complexity. Early in history, two such quantities were found, namely energy and momentum. It's well-known that "energy cannot be created or destroyed, only transferred between different forms". Likewise, the total momentum, here meaning mass times velocity, was seen to stay constant. So let's take it as a given that:
1. momentum = mass times velocity is a useful quantity, because
2. it is conserved in physical processes, e.g. in collisions.
Now we need to confront an interesting observation. Photons do indeed have no mass (here we don't care that this is because they're travelling at the speed of light, we've just measured it). Therefore, we might suppose, they have no momentum. But what happens, for example, when an electron absorbs, emits or scatters a photon? This happens constantly (e.g. Compton scattering, which is routinely seen in medical applications such as radiotherapy). The answer is that the electron's momentum changes. But we argued that momentum is conserved. How can this be?
There are two choices here:
1. We can abandon the idea that momentum is conserved -- or at least we can just assume that it isn't conserved in this case.
2. We can relax the definition of momentum so that photons can have it after all, and the conservation law still applies with this new view.
Which is more useful? Answer: 2! If, as Zebu's formula says, you associate the momentum with the frequency, then you can perfectly account for Compton scattering by applying conservation of momentum to predict the change in the frequency of the light.
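To make that concrete, here is a rough sketch of the bookkeeping in Python. It is only an illustration under assumptions of my own: units where c = 1 (so a photon's momentum equals its energy), an electron initially at rest, energies in MeV, and the standard Compton formula for the scattered photon's energy. The function name compton_scatter and the rounded electron mass are just labels for this example.

```python
import math

M_E = 0.511  # electron mass in MeV with c = 1 (approximate value)

def compton_scatter(e_in, theta, m=M_E):
    """Photon of energy e_in scatters through angle theta off an electron at rest.

    Returns (photon energy out, electron energy, electron momentum vector).
    """
    # Standard Compton result: the photon loses energy, i.e. its frequency drops.
    e_out = e_in / (1.0 + (e_in / m) * (1.0 - math.cos(theta)))
    # With c = 1 a photon's momentum has the same magnitude as its energy.
    p_in = (e_in, 0.0)
    p_out = (e_out * math.cos(theta), e_out * math.sin(theta))
    # Momentum conservation: the electron takes whatever the photon gave up.
    p_e = (p_in[0] - p_out[0], p_in[1] - p_out[1])
    # Energy conservation: the electron starts with rest energy m.
    e_e = e_in + m - e_out
    return e_out, e_e, p_e

e_out, e_e, p_e = compton_scatter(1.0, math.radians(90))
mass_sq = e_e**2 - (p_e[0]**2 + p_e[1]**2)
print("scattered photon energy:", e_out)
print("recoil electron E^2 - p^2:", mass_sq, "vs m^2:", M_E**2)
```

The check at the end is the point: if the photon is allowed to carry momentum, the recoiling electron comes out with E² - p² equal to m², exactly as it should. Give the photon zero momentum and the books no longer balance.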
In that sense, the definition momentum = mass times velocity was only introduced first because we weren't able to appreciate truly what momentum was when we first came across it. It works for objects with mass, travelling at low speeds, but doesn't extend far enough to be useful any more at high speeds or for objects without mass.
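For a feel of where "low speeds" stops, here is a small comparison. It assumes the usual relativistic expression for the momentum of a massive particle, p = mv / sqrt(1 - v²) in units where c = 1, which is not derived anywhere in this post. The function names are mine.

```python
import math

def newtonian_momentum(m, v):
    """The original definition: mass times velocity."""
    return m * v

def relativistic_momentum(m, v):
    """Assumed relativistic form, with v as a fraction of c (|v| < 1)."""
    return m * v / math.sqrt(1.0 - v * v)

for v in (0.01, 0.1, 0.5, 0.9, 0.99):
    ratio = relativistic_momentum(1.0, v) / newtonian_momentum(1.0, v)
    print(f"v = {v:<5} relativistic / Newtonian = {ratio:.4f}")
```

At 1% of the speed of light the two agree to a few parts in 100,000, which is why mass times velocity served so well for so long. By 99% they differ by a factor of about 7.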
* * * *
As to E² - p² = m²: again, you can apply the same idea. Physicists hunt for quantities that are fixed. In this case, the question is: "what is one value that all observers, no matter how fast they are moving, will agree on for a moving particle?" It's precisely the combination E² - p² that everybody measures the same way, and it's precisely that combination that is identified with the mass of a particle (strictly, with its square). For photons, the mass equals zero, so that this reduces to E = p. For anything else, at least one observer will see the particle at rest, so that observer can say that E = m (or mc², if they want to be boring).
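If you'd like to see this numerically rather than take it on trust, here is a quick sketch. It assumes that observers moving at different speeds are related by the standard Lorentz boost along one axis (with c = 1), which this post does not derive. The names boost and invariant, and the sample values, are purely illustrative.

```python
import math

def boost(energy, px, v):
    """Energy and momentum seen by an observer moving at velocity v along x (|v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (energy - v * px), gamma * (px - v * energy)

def invariant(energy, px):
    """The combination E^2 - p^2 that every observer should agree on."""
    return energy**2 - px**2

m = 1.0
p = 0.75
E = math.sqrt(m**2 + p**2)  # 1.25 for these values

for v in (0.0, 0.3, 0.6, -0.9):
    e_obs, p_obs = boost(E, p, v)
    print(f"v = {v:+.1f}  E = {e_obs:.4f}  p = {p_obs:+.4f}  E^2 - p^2 = {invariant(e_obs, p_obs):.4f}")
```

Every observer gets a different E and a different p, but the same E² - p². The observer with v = 0.6 is the one moving along with the particle: for them p is zero (up to rounding) and E is just m.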
I have no intention of proving that this quantity indeed stays constant, but it's useful to bear in mind that the natural way the equation should appear, E² - p² = m², tells us far more about where this equation comes from than any other arrangement.
More later. But I hope this acts as a useful extra to Zebu's approach.