Facial Recognition in sport venues – an introduction

June 12, 2018 | Long Reads

Fancam CEO Tinus le Roux writes:

Lately, most articles on facial recognition strike a similar tone to those viral videos of the creepy robotic dogs opening doors for themselves.
The main takeaway in both: The end is nigh!
This technology is going to kill us — we just don’t know when.

The reality is not quite as dramatic, and a basic understanding of the limitations of facial recognition will help a great deal in understanding why. More importantly, it will allow us to have more constructive discussions about the ethics surrounding the topic. (Unfortunately there’s nothing we can do about those robot dogs — they’re definitely going to kill us.)

So, while we wait for Skynet to be switched on, here’s a quick non-technical introduction to facial recognition, just to pass the time.
Let’s do it in three parts: an overview, an analogy and some general notes.


A non-technical, technical overview

What is commonly referred to as ‘Facial Recognition’ actually covers two separate computational processes: Facial Analysis and Facial Comparison.

Facial Analysis is the process of scanning a face and returning demographic information as a result. The output is usually limited to gender, age and mood, but some companies provide race as a metric too. (See the relevant note in the 3rd section.)

Some people would call the returned data anonymized, but it is in fact anonymous. This process has NOTHING to do with identity at all; it simply looks at facial structure, skin tone and other measurements to make a (very educated) guess at the gender, age and mood of the face in question.
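To make that concrete, here’s roughly what a facial analysis call looks like in code. This sketch uses Amazon Rekognition’s detect_faces API (the same product family that comes up in the notes below) purely as one example; the image file name is hypothetical and the response is trimmed to the essentials. Note that nothing in the result says who the person is.

```python
# A minimal sketch of facial analysis via Amazon Rekognition's detect_faces.
# Assumes boto3 is installed and AWS credentials are configured;
# "crowd_photo.jpg" is a hypothetical image file.
import boto3

client = boto3.client("rekognition")

with open("crowd_photo.jpg", "rb") as f:
    response = client.detect_faces(
        Image={"Bytes": f.read()},
        Attributes=["ALL"],  # request age, gender, emotions, etc.
    )

for face in response["FaceDetails"]:
    age_range = face["AgeRange"]                                 # e.g. {"Low": 25, "High": 35}
    gender = face["Gender"]["Value"]                             # e.g. "Female"
    mood = max(face["Emotions"], key=lambda e: e["Confidence"])  # highest-confidence emotion
    print(age_range, gender, mood["Type"])                       # demographics only -- no identity
```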

Facial Comparison is the process of comparing two faces and returning a similarity score based on measurements of key aspects of facial structure.
It’s obviously a bit more complex than that, but practically speaking you feed two faces into a system and it returns a result in the form of a percentage, e.g. these two faces are 95% similar.
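Under the hood, most modern systems reduce each face to a list of measurements (often called an ‘embedding’) and then score how close two of those lists are. The toy sketch below illustrates just the shape of that operation with made-up numbers; it isn’t any particular vendor’s method.

```python
# Toy facial comparison: two faces in, one similarity percentage out.
# The vectors are invented stand-ins for real facial measurements.
import numpy as np

def similarity_percent(face_a: np.ndarray, face_b: np.ndarray) -> float:
    """Cosine similarity between two face embeddings, as a percentage."""
    cos = np.dot(face_a, face_b) / (np.linalg.norm(face_a) * np.linalg.norm(face_b))
    return round(float(cos) * 100, 1)

face_1 = np.array([0.42, 0.91, 0.13, 0.77])  # fake measurements for face 1
face_2 = np.array([0.40, 0.89, 0.15, 0.80])  # fake measurements for face 2

print(f"These two faces are {similarity_percent(face_1, face_2)}% similar")
# Neither vector says anything about *who* either face belongs to.
```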

Still, no real ‘recognition’, no personally identifiable information.
No data to anonymise.

That’s it. Those are the only two things ‘Facial Recognition’ can do:
Facial Analysis (demographics) and Facial Comparison.
So, where’s the part where they find the criminal in the crowd or Facebook tells me who’s in the picture I just uploaded?

Well, to do that, you need a very important additional dataset.
One that has almost nothing to do with the technical process of ‘facial recognition’ explained above.

You need a database of faces assigned to identities.

This is the key ingredient. Without it you can spin up the most sophisticated deep-learning neural networks on quantum computing platforms you can imagine, but you still won’t be able to match a face with an identity. For that, you need access to a database of identified faces.
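Here’s a toy sketch (hypothetical names and numbers throughout) of why that database is the whole game: ‘recognition’ is just the comparison step from earlier, run against every face you already have a name for. Take the labelled gallery away and the matching loop has nothing to work with.

```python
# Hypothetical sketch: identification = comparison against a database of faces
# that already have identities attached. All names and vectors are invented.
import numpy as np

def similarity_percent(a: np.ndarray, b: np.ndarray) -> float:
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return round(float(cos) * 100, 1)

# The crucial, entirely separate dataset: faces assigned to identities.
labelled_gallery = {
    "Jane Example": np.array([0.39, 0.95, 0.10, 0.70]),
    "John Example": np.array([0.10, 0.20, 0.95, 0.30]),
}

unknown_face = np.array([0.42, 0.91, 0.13, 0.77])  # a face pulled from a crowd photo

best_name, best_score = max(
    ((name, similarity_percent(unknown_face, face)) for name, face in labelled_gallery.items()),
    key=lambda pair: pair[1],
)

if best_score > 90:  # arbitrary threshold for the sketch
    print(f"Probably {best_name} ({best_score}% similar)")
else:
    print("No match in the database -- the face stays anonymous")
```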

Yes, it does seem obvious (because it is), but in the chaos that inevitably comes with new technology it’s a point that seems to get lost quite easily, and it often leads conversations about facial recognition down the wrong rabbit holes. (See the note on the Amazon and police department collaboration in the 3rd section.)


Counting Cars

Before we go down one ourselves, let’s circle around one more time and have a look at this non-technical, technical explanation through the lens of an analogy. Here we go …

You’re asked to stand on the side of your street and count cars for a day.
The goal is to compile a list of the types of cars that drive by and then to use it to see if we can learn something about your neighborhood.
Clipboard and pencil in hand (because you’re old school that way), you put on your yellow vest and take up your position on the side of the road:
1 x Toyota ; 1 x Ford ; 1 x Chevrolet ; 1 x Toyota ; 1 x Toyota ; 1 x Tesla ; and so on and so on…

By the end of the day you collate the data and produce a report. A simple bar graph, with each bar representing a car manufacturer and showing the number of vehicles seen for each.
It’s not very interesting in itself, but when compared to similar reports from other streets around the country, you’re able to get a good picture of the economic composition of your community.

Based on the comparison and known research, you’re able to make some deductions about the kinds of shops that would do well in your neighborhood. You can go further and predict what … well, that’s not really the point.
The point is that you can make commercial decisions based on the data and that it’s completely anonymous data.

It has not been anonymized; you never gathered personally identifiable data. You have absolutely no idea who drove past today, and consequently you didn’t encroach on anyone’s right to privacy.
Counting cars = Valuable data & no privacy concerns.

This is analogous to capturing demographic data through facial analysis – a process that can safely be used in stores, venues and other public spaces to get a better understanding of customers without infringing on anyone’s right to privacy.

Sorry to send you out again, but the analogy requires one more dab of sunscreen and a new sheet on that clipboard of yours.
This time we need you to also capture all the licence plate numbers you see. Go!

By the end of day two you have some really interesting data.
The first thing you notice is that 68% of the plate numbers are recurring.
It must be people dropping their kids at school and then coming back the same way. That’s interesting, but what’s even more interesting is when you read it along with the manufacturer data.
Of the 68% of recurring cars, 83% happen to be Toyotas.
This shows that yesterday’s data stating that Toyota had a 40% market share in your neighborhood was wrong, because we double-counted a lot of those cars. So, while there are fewer Toyotas in your neighborhood than originally thought, it’s also clear that they are disproportionately popular for picking up kids at school.
We can go on and deduce fascinating fake stats like this all day, but the point is that the additional set of data included unique identifiers, and these allowed you to compare cars and establish ‘similarity’. Adding similarity metrics to your dataset made it infinitely more valuable.
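For what it’s worth, the whole two-day exercise fits in a few lines of code. The plates and counts below are as fake as the stats above; the only point is that a unique identifier lets you de-duplicate and spot recurring cars without ever knowing who owns them.

```python
# The two days of car counting as data. The plates are made-up unique
# identifiers; no owners are known, so the data stays anonymous.
from collections import Counter

sightings = [  # (licence plate, manufacturer) -- fake sample data
    ("CA 123-456", "Toyota"), ("CA 123-456", "Toyota"),  # same car seen twice
    ("CA 777-888", "Ford"),   ("CA 999-000", "Toyota"),
    ("CA 555-111", "Tesla"),  ("CA 999-000", "Toyota"),
]

# Day one's view: raw counts per manufacturer (double-counts recurring cars).
raw_counts = Counter(make for _, make in sightings)

# Day two's view: de-duplicate by plate before counting.
unique_cars = dict(sightings)              # plate -> make, one entry per car
unique_counts = Counter(unique_cars.values())

plate_counts = Counter(plate for plate, _ in sightings)
recurring = sum(1 for count in plate_counts.values() if count > 1)

print("Raw market share:", raw_counts)
print("Unique-car market share:", unique_counts)
print(f"{recurring} of {len(plate_counts)} cars were seen more than once")
```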

What’s also important is that, despite having captured unique identifiers, your data is still anonymous. You have not tracked anyone in any way and no one’s privacy has been compromised.

For your data to go from interesting to unethical, you need something else, and it has nothing to do with your data-capturing capabilities.
You need a list of licence plate numbers and their corresponding owners.
If you combine that list with your data, you will be able to track your neighbors.

Bringing this back to ‘facial recognition’:

Counting car types is similar to extracting demographic information about customers through facial analysis. It sounds a bit creepy, but really isn’t. It doesn’t encroach on anyone’s right to privacy and, if used intelligently, can provide really valuable insights.

Matching number plates is similar to using comparison analysis on faces. It sounds like recognition, but really isn’t.
It can provide valuable insights into crowd composition, but is still completely anonymous.

Matching identities to number plates or faces — now you’ve gone to the dark side. There are definite privacy concerns here, and it can only be justified under very specific circumstances or with the explicit consent of the individuals involved.


General notes and FAQs

The goal of this post is to provide a quick and easy introduction to ‘facial recognition’, and to do so from a practical, non-technical perspective.
Hopefully that has been covered above.
This 3rd section is simply an addendum of random thoughts on the topic and also gives me an opportunity to answer some of the most common questions I receive from #sportsbiz folks.

  • Q: Who’s the best at facial recognition?
    You’ll find many companies trying to stake a claim to being ‘the best at facial recognition’, but the reality is that there isn’t a clear leader and I don’t expect there ever will be. Here’s the bottom line: both facial analysis (demographics) and similarity (comparison) are solved problems.
    To my knowledge there is no single company in the world that owns a piece of IP that makes their efforts in this regard substantially more accurate than anyone else’s. The original analytical algorithms are published, and the only difference between platforms at this point is the quality of their training data. Facial recognition as a service is therefore a ‘race to the bottom’ (comparable products competing on cost, not quality of results).
    I’m going to have the facial recognition service providers shout at me for this one, but from a practical point of view the differences between the top guys are negligible, and I predict they will become even smaller in future.
  • Notes: Crowd Demographics vs Personally Identifiable Information (PII)
    One of the biggest mistakes I see in Sports Business today is an overemphasis on PII. It’s the same mistake I saw 5 years ago when everyone was talking about installing Bluetooth beacons in venues. “These will allow us to communicate with everyone in the building at any time”. So here’s why that didn’t happen (and why PII is overrated): When it comes to communication, the big boys (Apple, Samsung, Google) will always put their customers first. No matter how dearly and genuinely teams, brands or organisations would like to communicate with their clients, those who power the handsets will always allow their customers, the individuals, to control incoming communication.
    Imagine what you would do if your home screen suddenly became a carousel of auto-play video ads.
    So, the fact that your iPhone can pick up a Bluetooth beacon does not mean teams will suddenly be able to communicate with you without your consent — and without clear utility provided by said teams, you’re not going to give that consent.
    This is why you spend so much time unsubscribing from most emails — the nuisance outweighs the potential utility.
    Having the PII of all your fans is not the be-all and end-all of Sportsbiz!!
    Sorry for shouting, but it’s just so frustrating to see these cycles repeat themselves and people being … not-smart … about it.
    Understanding your fans and their needs is much more important and valuable than ‘owning their data’ — and this is where accurate demographics and similarity metrics will change the industry.
    Anonymous, non-creepy data that allows teams to do A-B testing, pick up trends, do predictive analysis and ultimately understand their fan base as a community is much more valuable than a CSV file with email addresses.
    I’ve already done more ‘shop talk’ in this section than I wanted to, so let’s just leave it at this: you don’t own your fans’ contact details; it’s their data and they will control how you’re allowed to communicate with them.
    Rather focus everything you have on understanding your fans so that when you do communicate (directly or indirectly) they don’t unsubscribe — forever.
  • Notes: “Amazon is selling police departments a real-time facial recognition system”
    This story broke a couple of weeks ago.
    Politicians were quick to focus on Amazon, but I think they’re missing a trick.
    There’s really no point going after Amazon and their Rekognition product.
    Firstly because they’re not the only ones offering such a product.
    Google, NEC, IBM and many others offer the same services.
    So stopping Amazon doesn’t stop the problem.
    Secondly, Rekognition isn’t harmful in itself; it is only potentially harmful if combined with an existing database of faces assigned to identities.
    If the police department has such a database, the use of that database should definitely be investigated and possibly regulated — not Amazon.
    It’s a complex issue and I’m not going to weigh in on the merits or legality of police surveillance here, but what I can say is that chasing Amazon is comparable to chasing Canon because the police buy cameras from them. The discussion should be about how law enforcement should and shouldn’t use the database of faces and identities they manage.
  • Q: Are you saying everything is dandy and facial recognition is benign?
    No, not at all. I’m just providing a technical introduction so you’ll be able to recognise the real dangers more easily. It’s a confusing and fast-moving world out there, but when it comes to facial recognition, the only companies/entities who are in a position to actually track you are those who own (or have free access to) a database of faces with assigned identities. It’s a short list: Facebook, Google, Apple, TSA, NSA… but it will grow longer as companies find ways and reasons for you to share your biometric information with them.
  • Notes: Concert bombing in Manchester, England.
    I have no inside information on this case, but given the fact that I was researching facial recognition at the time a couple of elements of the case stood out for me.
    Quick reminder of the horrific incident: a suicide bomber blew himself up outside an Ariana Grande concert in Manchester last year.
    Several concert goers were killed in the process.
    Within 24 hours the authorities 1) knocked down the apartment door of the bomber and 2) announced that this was not an isolated incident; they expected more to come.
    They were right, but how on earth could they know this?
    The only way to find the apartment of someone who literally blew the evidence up is through ‘forensic surveillance’.
    In short, because almost every square meter in England is under surveillance, authorities were able to ‘rewind the tape’ and track the perpetrator back onto the train, back to his flat and then also back several days to confirm that he met with people who were on some sort of watchlist.
    This does not mean that authorities tracked any people who went to the concert. The CCTV cameras did not match any identities before the incident and no privacy rights were violated, but because everything was captured, they could go back, track the bomber and prevent more crimes from happening.
    If it did indeed happen this way, it’s a responsible use of the technology that keeps us safer without encroaching on our right to privacy or free association. That being said, the same system could obviously be used to do harm too, and it’s therefore important to have a ‘separation of church and state’ when it comes to these systems. In short: we must keep our eyes on whoever owns that database of faces, for that’s the critical point of failure in all these systems.
  • Q: “Do you provide racial demographics?”
    At Fancam we provide sports teams with demographic data on their crowds so that they can make better business decisions.
    I get the question above quite often, and my answer is always, “No, we don’t.”
    I have both ethical and technical reasons for this stance.
    Ethical: I grew up in South Africa during the last years of Apartheid and saw what racial profiling can do. In my opinion the world has done enough experimenting in this regard and now’s as good a time as any to stop.
    Technical: It turns out, race is the most difficult of the demographics to pull from facial analysis. Apparently we’re more alike than we think and race is also more fluid than many would expect ;)
  • Notes: Public profile pictures 
    Hopefully it’s clear by now that facial analysis can only be turned into recognition/tracking with the help of a database of faces assigned to identities. This is true and should give you some level of comfort, but there is an important caveat: if you use your face as your publicly viewable profile image on a platform such as Facebook, a third party could technically scrape this information and create the kind of database required.
    Just something to keep in mind when thinking about these things.
