Digital footprints and credit scoring

Tobias Berg, Valentin Burg, Ana Gombović, Manju Puri 24 August 2018



The use of digital footprints has set high expectations for financial inclusion. The World Bank has asked “Can digital footprints lead to greater financial inclusion?” (Kumar and Muhota 2012), while a Harvard Business Review headline suggested that “Fintech Companies Could Give Billions of People More Banking Options” (Kendall 2017). Venture capital firms are pouring money into start-ups that aim to evaluate the creditworthiness of unbanked people for whom no historical data on repayment behaviour exist. This line of thought suggests that digital footprints have the potential to extend financial inclusion to some of the two billion working-age adults worldwide who currently lack access to services in the formal financial sector.

However, there is a dark side, too. When digital footprints are used in practice, headlines quickly converge on privacy concerns. “Admiral hikes insurance costs for drivers using Hotmail email addresses”[1] was a headline in The Sun this year, highlighting the existence of ‘statistical discrimination’, where decision makers use group statistics as proxies for unobservable, outcome-relevant characteristics (Fang and Moro 2011). Concerns over the use and misuse of digital footprints have been prominently illustrated by China’s Social Credit System (Creemers 2018). This dark-side view suggests that the use of digital footprints can have a considerable impact on everyday life, with consumers constantly having to consider footprints that they have so far usually left without any further thought.

Setup and digital footprint variables

In a recent paper (Berg et al. 2018), we consider the use of simple, easily accessible variables from the digital footprint for credit scoring. We access a comprehensive and unique data set covering approximately 250,000 observations from an e-commerce company located in Germany. Judging the creditworthiness of its customers is important because goods are shipped first and paid for later. The variables we cover are summarised in Table 1. We argue that these variables plausibly proxy for income, character, and reputation.

Table 1 Digital footprint variables

All of these variables stand out in terms of their ease of collection – almost every firm operating in the digital sphere can effortlessly track the digital footprint we use. Processing and interpreting these variables requires neither human ingenuity, nor effort on the part of the applicant, nor the availability of friendship or social network data; simply accessing or registering on the website is enough. Our data set is complemented by a credit bureau score, allowing us to compare the predictive power of the digital footprint with that of the credit bureau score.

Key results

We find that even simple, easily accessible variables from the digital footprint have a predictive power that equals or exceeds the predictive power of traditional credit bureau scores. The result is illustrated in Figure 1. Default rates by decile of the credit bureau score range from 0.40% (lowest decile) to 2.69% (highest decile). The difference in default rates between customers using iOS (Apple) and Android is equivalent to the difference in default rates between a median credit score and the 80th percentile of the traditional credit bureau score. There is also a wide dispersion in default rates between users of a premium internet service (T-online, a service that mainly sells to affluent customers at higher prices but with better service) and users of older, free services such as Hotmail and Yahoo. Behavioural traits, such as whether the customer arrives at the website from a price comparison site or via a search engine ad, are also highly predictive of default rates.
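The group comparisons described above come down to computing a default rate per category of a footprint variable. The following is a minimal sketch of that arithmetic only; the function name and the toy records are illustrative assumptions and are not taken from the paper's data.

```python
from collections import defaultdict


def default_rate_by_group(groups, defaulted):
    """Return {group: share of orders in that group that defaulted}."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for group, flag in zip(groups, defaulted):
        totals[group] += 1
        defaults[group] += int(flag)
    return {group: defaults[group] / totals[group] for group in totals}


# Made-up toy records (NOT the paper's data): 1 = defaulted, 0 = repaid.
device_os = ["iOS", "Android", "Android", "iOS", "Android", "iOS", "Android", "iOS"]
defaulted = [0, 1, 0, 1, 0, 0, 1, 0]

rates = default_rate_by_group(device_os, defaulted)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%}")
```

The same function could be applied to any categorical footprint variable, or to credit score deciles once each customer has been assigned a decile label.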

We provide a more formal analysis of the discriminatory power of digital footprint variables by constructing receiver operating characteristic (ROC) curves and determining the area under the curve (AUC). The credit score using digital footprint variables yields an AUC of 69.6%, thereby exceeding the discriminatory power of the credit bureau score (an AUC of 68.3%). Combining digital footprint variables with the credit bureau score improves the AUC to 73.6%, suggesting that the digital footprint provides information that is complementary to the credit bureau score.
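For readers less familiar with the AUC, it can be read as the probability that a randomly chosen defaulter is assigned a higher risk score than a randomly chosen non-defaulter (ties counting half). The sketch below computes it by direct pairwise comparison. The toy scores are made up, and combining the two scores by simple addition is an assumption chosen for illustration; it is not the model used in the paper.

```python
from itertools import product


def auc(risk_scores, defaulted):
    """Area under the ROC curve via pairwise comparison.

    Equals the share of (defaulter, non-defaulter) pairs in which the
    defaulter received the higher risk score; ties count as half.
    """
    pos = [s for s, d in zip(risk_scores, defaulted) if d]
    neg = [s for s, d in zip(risk_scores, defaulted) if not d]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data (NOT from the paper): 1 = defaulted, 0 = repaid.
# Higher score means higher predicted risk.
defaulted = [1, 0, 0, 1, 0, 0, 1, 0]
bureau_risk = [80, 30, 60, 40, 20, 50, 70, 10]
footprint_risk = [60, 20, 70, 90, 30, 10, 50, 40]

# Illustrative assumption: combine the two scores by adding them.
combined_risk = [b + f for b, f in zip(bureau_risk, footprint_risk)]

for name, scores in [("bureau", bureau_risk),
                     ("footprint", footprint_risk),
                     ("combined", combined_risk)]:
    print(f"{name}: AUC = {auc(scores, defaulted):.3f}")
```

In this toy example each score alone ranks 13 of the 15 defaulter/non-defaulter pairs correctly (AUC of about 0.867), while the combined score reaches 0.9 because the two scores make different mistakes. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.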

Figure 1 Default rates for selected digital footprint variables   

The digital footprint is also predictive of default for customers lacking a traditional credit bureau score (‘unscorable customers’). We find that the discriminatory power of the digital footprint for unscorable customers matches the discriminatory power for scorable customers. This result holds in the context of a developed economy, so any extrapolation to a developing market setting is subject to external validity concerns. With this caveat in mind, our findings provide suggestive evidence that digital footprints have the potential to boost financial inclusion for the two billion adults worldwide who lack access to credit.


The digital footprint variables we use are simple and easily accessible for any firm conducting business in the digital sphere. The information content of these variables plausibly provides a lower limit on the information content of broader sets of digital footprint variables that may be used in practice. Our results therefore suggest that FinTechs using digital footprints can gain a significant informational advantage vis-à-vis any competitor that does not make use of this information. 

In developing countries, the inability of the unbanked population to participate in financial services is often caused by a lack of information infrastructure, such as credit bureau scores. As digital footprint variables are also available for customers without a credit score, analysing borrowers’ digital behaviour may present an opportunity to boost financial inclusion. Providing access to credit to the unbanked via digital footprints is a rapidly developing field in the FinTech sphere.[2]

In the developed world, concerns about privacy and discrimination are likely to dominate the headlines. First, digital footprints might proxy for variables that are legally prohibited from being used for credit scoring (such as race, colour, gender, national origin, or religion). It is also conceivable that incumbent financial institutions, threatened by competitors using digital footprints, might use their well-established access to politicians and regulators to lobby for a restriction of the use of digital footprints on these grounds. Second, we expect privacy concerns to be key in the debate. The use of digital footprints can have a considerable impact on everyday life, with consumers constantly having to consider digital footprints that they have so far usually left without any further thought. This argument becomes even more relevant if lenders expand the scope of the digital footprint variables they use – the more of our devices are connected to the internet, the more of our personal communication can be traced online.


Digitisation is one of the major trends of our time, and the availability of digital footprints is likely to shape credit decisions going forward. The use of digital footprints can have profound implications for access to credit for the unbanked, but also for the possibility of statistical discrimination and privacy concerns. It is important that researchers, consumers, firms, and regulators watch the development closely and assess the impact of the use of digital footprints on the economy and society. 


Berg, T, V Burg, A Gombović, and M Puri (2018), “On the Rise of FinTechs – Credit Scoring using Digital Footprints”, NBER Working Paper No. 24551.

Creemers, R (2018), “China's Social Credit System: An Evolving Practice of Control”, Working Paper.

Fang, H and A Moro (2011), “Theories of Statistical Discrimination and Affirmative Actions: A Survey”, in Handbook of Social Economics Vol. 1A, North-Holland.

Kendall, J (2017), “Fintech Companies Could Give Billions of People More Banking Options”, Harvard Business Review, 20 January.

Kumar, K and K Muhota (2012), “Can digital footprints lead to greater financial inclusion?”, World Bank Report Brief 71304.


[1], accessed 19 July 2018. 

[2] Alibaba/AntFinancial in China, M-Pesa/M-Shwari in Kenya, and Kreditech and LenddoEFL in various emerging markets are examples of FinTechs in this area. Many of the FinTechs in this area leverage digital footprint data over and above the simple footprints we analyse in our research. See also Kendall (2017) for a discussion.    




Tobias Berg: Associate Professor of Finance, Frankfurt School of Finance & Management

Valentin Burg: Affiliated researcher, Humboldt University Berlin

Ana Gombović: PhD candidate in Finance, Frankfurt School of Finance & Management

Manju Puri: J.B. Fuqua Professor of Finance at the Fuqua School of Business, Duke University

