RE: (Labelling) 13 Nov 2019 22:50
JC, I have been thinking.
Apologies, my first thought on labelling was that there is a whole world outside the vehicle and that is a lot to label - is it a bike, a kangaroo, a human with a bike, an umbrella or shopping bags, or someone walking with a child, a dog or a horse, etc. Having a single researcher driving to generate data didn't seem to make sense, but that would only be part of the job. It could be 2 weeks on a test site with volunteers, but some driving would be required. Those tests would require labelling.
I then read the SEE patent and started to get more excited as I realised that we have 3 billion km of driving data - yes, most is of drowsy drivers, but here is the key part that I forgot about: many or even most gen 2 Guardians have forward-facing cameras. Sure, they only record for the seconds before and after an event, but we have road data. Some is triggered by the driver pressing the button to capture "the idiot in front"; others will be of drivers not looking, but given the vast volume of data, a reasonable number of snippets should be usable, with a driver looking ahead and the front camera seeing the view in front.
OK, here comes the best bit. Guardian has the front and driver-facing cameras mounted in the same unit with fixed geometry - far better than a car, where the cameras could be separated and be at different angles in different vehicles. In Guardian, they are effectively at the same height (y axis) and the same forward distance (z axis), and on the x axis (parallel to the axle) they are only separated by a few fixed cm. SEE also know the relative angles (say ~180 degrees apart), but again it is fixed.
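To make the fixed-geometry point concrete, here is a rough sketch of moving a point from the driver-facing camera's frame into the forward-facing camera's frame. Everything in it is my own assumption for illustration: the function names, the exactly 180 degree rotation about the vertical axis and the 5 cm lateral offset are not taken from the patent or from any Guardian spec.

```python
import math

def rotate_about_y(v, angle_deg):
    """Rotate v = (x, y, z) about the vertical (y) axis by angle_deg."""
    a = math.radians(angle_deg)
    x, y, z = v
    return (math.cos(a) * x + math.sin(a) * z,
            y,
            -math.sin(a) * x + math.cos(a) * z)

def driver_to_forward_frame(point, yaw_deg=180.0, offset=(0.05, 0.0, 0.0)):
    """Map a point from the driver-camera frame to the forward-camera frame.

    yaw_deg and offset are hypothetical calibration constants: the two
    cameras are assumed to face opposite ways and sit a few cm apart in x.
    """
    rx, ry, rz = rotate_about_y(point, yaw_deg)
    return (rx + offset[0], ry + offset[1], rz + offset[2])

# Eyes 0.9 m in front of the driver-facing camera and 0.1 m above it
# end up 0.9 m *behind* the forward-facing camera.
eyes_in_forward_frame = driver_to_forward_frame((0.0, 0.1, 0.9))
```

Because the unit is rigid, these constants would only need to be measured once per hardware design rather than once per vehicle.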
Now we have a nice data set to work with. As discussed in the patent, they still have to find the driver and the eyes and determine the gaze direction (this is SEE after all). The eye positions in the 2-d picture are done; then you need the distance to the eyes. Well, we know the size of an eye, so that is another triangle and you have the distance. Now take the gaze from the eyes, draw it forward and work out where it lands on the 2-d forward-facing image.
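The "another triangle" step is just similar triangles in a pinhole camera model. A minimal sketch, assuming an iris diameter of about 11.7 mm as the known eye size and a made-up focal length of 1000 pixels (both are my illustrative values, not numbers from the patent):

```python
def distance_from_apparent_size(real_size_m, size_px, focal_px):
    """Pinhole similar triangles: distance = focal_px * real_size / size_px."""
    if size_px <= 0:
        raise ValueError("size_px must be positive")
    return focal_px * real_size_m / size_px

def back_project(u, v, depth_m, focal_px, cx, cy):
    """Turn pixel (u, v) at a known depth into a 3-d point in the camera frame.

    (cx, cy) is the principal point (roughly the image centre).
    """
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)

# Hypothetical numbers: an 11.7 mm iris that spans 13 px in the image.
depth = distance_from_apparent_size(0.0117, 13.0, 1000.0)   # about 0.9 m
eye_xyz = back_project(730.0, 360.0, depth, 1000.0, 640.0, 360.0)
```

The smaller the eye appears in pixels, the more a one-pixel measurement error moves the distance estimate, so the real system presumably does something more robust than this.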
This bit is a bit trickier, so we use a bit of cunning - roads are "generally" horizontal (they mentioned measuring a slope), but I was a physicist, so I will assume an infinite horizontal surface for now. Let's say that the truck driver's eyes are 6 feet up and the Guardian is at roughly the same height (only 1 foot above or below). Now you can intersect the gaze with the perfectly horizontal road. Redo the maths for different heights and separations of the Guardian and the fixed point stretches from a circle to an ellipse, maybe 2-5 metres long and 1 metre wide. But on the 2-d camera image that corresponds to a more circular blob. Now assume that there was a truck in the way with a nice vertical back. The maths still works out with the same circular blob location.
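Here is a rough sketch of that geometry: intersect the gaze ray with a flat road (y = 0) or with a vertical truck back (z = constant), then project the hit point into the forward camera with a pinhole model. The function names, the yaw/pitch parameterisation of gaze, the focal length and the image centre are all my own assumptions for illustration.

```python
import math

def gaze_direction(yaw_deg, pitch_deg):
    """Unit gaze vector; x = lateral, y = up, z = forward. Negative pitch looks down."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            math.cos(pitch) * math.cos(yaw))

def hit_road(eye, d):
    """Intersect the gaze ray with the flat road plane y = 0."""
    if d[1] >= 0:
        return None  # looking at or above the horizon, never meets the road
    t = -eye[1] / d[1]
    return tuple(e + t * di for e, di in zip(eye, d))

def hit_vertical_plane(eye, d, z_plane):
    """Intersect the gaze ray with a vertical surface at z = z_plane."""
    if d[2] <= 0:
        return None  # not looking forward
    t = (z_plane - eye[2]) / d[2]
    if t <= 0:
        return None  # surface is behind the eye
    return tuple(e + t * di for e, di in zip(eye, d))

def project(point, cam, focal_px, cx, cy):
    """Pinhole projection into a camera at `cam` looking along +z (image v points down)."""
    dx, dy, dz = (p - c for p, c in zip(point, cam))
    if dz <= 0:
        return None
    return (cx + focal_px * dx / dz, cy - focal_px * dy / dz)

eye = (0.0, 1.83, 0.0)              # eyes roughly 6 feet above the road
gaze = gaze_direction(0.0, -5.0)    # straight ahead, 5 degrees down
road_hit = hit_road(eye, gaze)      # lands about 21 m ahead
truck_hit = hit_vertical_plane(eye, gaze, 10.0)  # truck back 10 m ahead

cam = (0.05, 1.53, 0.0)             # camera a few cm across and 1 foot lower
road_px = project(road_hit, cam, 1000.0, 640.0, 360.0)
truck_px = project(truck_hit, cam, 1000.0, 640.0, 360.0)
```

One caveat this makes visible: the road hit and the truck hit land on exactly the same pixel only if the camera sits at the eye position. With the camera offset by a foot or so the two pixels differ by some parallax, which shrinks as the target gets further away, so "same blob" is an approximation rather than an identity.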
Now we just need to label roads or trucks. Add in rapid lateral movement and see if the eyes follow.