In many developing countries, socioeconomic data is not collected regularly. Owing to high costs and political factors, even the population census in Pakistan was last conducted over fifteen years ago. Meanwhile, the Information and Communication Technology (ICT) sector has grown enormously in Pakistan over the last couple of years, giving the country one of the highest tele-density levels in the region. Despite this high tele-density, Pakistan lacks public socioeconomic datasets that can be combined with the real-time data trail left behind by mobile users in a particular region to gain valuable insights into its socioeconomic state.
Solution
The objective of this study was to identify the relationship between mobile usage data and socioeconomic status using pre-existing datasets collected by our telecom partners and the national census organizations. This study does not identify relationships at the individual level; rather, we make inferences about the socioeconomic indicators of the region covered by a cellular tower. We had access to eleven months of data from Mobilink, a major telecom provider in Pakistan, covering prepaid subscribers in district Jhelum in the Punjab province. This data was then correlated with variables in two publicly available socioeconomic datasets: the Mauza Census of 2008 and the Population Census of 1998. In addition to the mobile data variables obtained from Mobilink, phone prices were collected by looking up handset names on popular retail websites in Pakistan.
1. Users were mapped onto the tower that covers their place of residence.
2. Cell site boundaries were estimated using the Voronoi algorithm.
3. Census data was mapped onto settlements.
4. The settlement-level census data was then mapped onto towers.
5. Once both census and mobile data variables were mapped onto towers, the correlation between them was computed.
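The steps above can be illustrated with a minimal sketch. It is not the study's actual code: the tower coordinates, settlement values, and phone prices are made-up toy data, the coordinates are assumed to be planar, and the Pearson coefficient is an assumed choice of statistic, since the text does not say which one was used. The sketch relies on the fact that a point lies in a tower's Voronoi cell exactly when that tower is the nearest one, so a nearest-tower lookup stands in for constructing the cell boundaries explicitly.

```python
import math
from collections import defaultdict


def nearest_tower(point, towers):
    """Return the id of the tower closest to `point`.

    A point belongs to a tower's Voronoi cell exactly when that tower is
    its nearest one, so this is equivalent to a Voronoi cell lookup.
    """
    return min(towers, key=lambda tid: math.dist(point, towers[tid]))


def aggregate_by_tower(settlements, towers):
    """Average a settlement-level census value over each tower's cell."""
    values = defaultdict(list)
    for xy, value in settlements:
        values[nearest_tower(xy, towers)].append(value)
    return {tid: sum(v) / len(v) for tid, v in values.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data (not from the study): tower positions in planar units.
towers = {"T1": (0.0, 0.0), "T2": (10.0, 0.0), "T3": (0.0, 10.0)}

# Hypothetical settlements: (position, census indicator such as literacy rate).
settlements = [
    ((1.0, 1.0), 0.40),
    ((2.0, -1.0), 0.60),
    ((9.0, 1.0), 0.80),
    ((1.0, 9.0), 0.30),
]

# Hypothetical mobile-side variable per tower (average handset price).
avg_phone_price = {"T1": 9000.0, "T2": 15000.0, "T3": 7000.0}

# Steps 3-4: census values on settlements, then settlements onto towers.
census_by_tower = aggregate_by_tower(settlements, towers)

# Step 5: correlate the two tower-level variables.
tower_ids = sorted(census_by_tower)
r = pearson(
    [census_by_tower[t] for t in tower_ids],
    [avg_phone_price[t] for t in tower_ids],
)
print(f"Pearson r between census indicator and phone price: {r:.3f}")
```

With real data, geographic coordinates would need a suitable projection or a great-circle distance before the nearest-tower lookup, and a simple average may not be the right way to aggregate every census variable to tower level.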