Ok, the why is easy ... "because we can". But is this even practical, or needed?
So ok... this will let Canon deliver a sub-35mm DSLR with resolution twice that of the best digital Hasselblad. But that comes at a price... tiny pixels. 2.2µm x 2.2µm pixels... about the size of the ones in your lower-end digital P&S or consumer camcorder. That means that 120Mpixel is only something you get in bright sunlight. When it gets dark... well, better have some good pixel averaging and noise reducing tech, and the means to convince the user to use much less resolution.
Then there's "what's the point anyway". This is going to deliver a resolution of about 230lp/mm. If you look at most professional SLR lens tests, the better lenses are holding their own on MTF tests at 40lp/mm. For example, the very sharp (and expensive) Canon f2.8 28-70mm lens did a 70% on the MTF40 test at f8, and delivered an MTF50 lens score of 61lp/mm (that's the point at which the lens delivers a 50% contrast reduction). So this $1000+ lens on a Canon with that 230lp/mm sensor will deliver... something slightly below 61lp/mm of resolution. Peak. System resolution is a convolution function... you never get better than your weakest link.
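If you want to check the arithmetic, here's a rough sketch. It assumes a 2.2 micron pixel pitch (that's what gives you roughly 230lp/mm), and it combines lens and sensor with the usual quadrature rule of thumb, which is my assumption for illustration and not a proper MTF calculation. The function names are made up for this example.

```python
import math

def nyquist_lp_per_mm(pixel_pitch_um):
    """Sensor limit: one line pair needs at least two pixels."""
    pixels_per_mm = 1000.0 / pixel_pitch_um
    return pixels_per_mm / 2.0

def system_lp_per_mm(lens_lp_mm, sensor_lp_mm):
    """Rule-of-thumb combination (assumed here): 1/R^2 = 1/R_lens^2 + 1/R_sensor^2."""
    return 1.0 / math.sqrt(1.0 / lens_lp_mm**2 + 1.0 / sensor_lp_mm**2)

sensor = nyquist_lp_per_mm(2.2)          # roughly 227 lp/mm
system = system_lp_per_mm(61.0, sensor)  # 61 lp/mm is the lens score quoted above

print(f"sensor limit: {sensor:.0f} lp/mm")
print(f"lens + sensor: {system:.1f} lp/mm")
```

Whatever combining rule you pick, the result lands a little under the lens's 61lp/mm, nowhere near the sensor's limit.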
So I really do wonder. In professional audio, we all have 24-bit gear these days. Only, most of this gear doesn't really offer a 144dB S/N ratio. The difference is what we call "marketing bits"... you get a real 20-bit out of the unit, those extra 4 bits are to sell it against all the other 24-bit gear on the market.
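The bits-to-dB numbers come straight from the ideal case, about 6dB per bit. A quick sketch, ignoring dither and the small sine-wave correction term (the function name is just for illustration):

```python
import math

def ideal_dynamic_range_db(bits):
    """Ideal dynamic range of an N-bit converter: 20*log10(2^N), about 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bits)

for bits in (20, 24):
    print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.1f} dB")
```

So the gap between the 24 bits on the box and the 20 real ones is about 24dB of range that the analog side never delivers.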
Unless Canon has a radically new lens technology, 120Mpixels is a big steaming pile of marketing bits. That Hasselblad I mentioned... they'll get substantially more out of their 60Mpixel sensor. Not because of better lenses (though they do have some epic lenses... I could buy one or two, if I sold my car), but because the sensor is much larger. So they don't need the same lp/mm from the lens.. they have more mm on the sensor.
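To put a number on "more mm on the sensor": what ends up in the picture is line pairs across the whole frame, which is lp/mm times sensor size. A rough sketch, where the sensor heights are round illustrative figures I'm assuming (about 20mm for the small sensor, about 40mm for medium format), not official specs, and the same 61lp/mm lens figure is used for both:

```python
def lp_per_picture_height(lens_lp_mm, sensor_height_mm):
    """Line pairs resolved across the frame height for a lens-limited system."""
    return lens_lp_mm * sensor_height_mm

LENS_LP_MM = 61.0          # lens score quoted earlier in this post
SMALL_SENSOR_MM = 20.0     # assumed round figure, not a spec
MEDIUM_FORMAT_MM = 40.0    # assumed round figure, not a spec

small = lp_per_picture_height(LENS_LP_MM, SMALL_SENSOR_MM)
large = lp_per_picture_height(LENS_LP_MM, MEDIUM_FORMAT_MM)
print(f"small sensor: {small:.0f} lp/ph, medium format: {large:.0f} lp/ph")
```

Same lens quality, twice the sensor height, twice the detail across the frame, and the pixel count on the small sensor doesn't change that.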
So I do wonder... this sounds very cool. But once you scratch the surface, I'm getting a sensor that's going to be crap, or far lower resolution, once I'm away from ideal lighting. No lens in existence is going to deliver more than about 1/4 of that effective resolution... so why not just make the pixels larger?
For video, no one cares about a full HD image in 1/60th of the sensor space (particularly when you have the color distortions inherent in Bayer interpolation). For video, I want super low-light performance and shallow DOF in a DSLR. Ok, I'll take a 4K video mode if you've got it. Or something other than AVCHD, but I'm not holding my breath. How about a few actual video-friendly features, like audio levels, large files (>4GB), and no overheating or stopping every 10-12 minutes.