Equifax has transformed from a traditional credit bureau into a company offering government, human resources, talent, property, employment and income, and other data services. Its cloud, data and business reinvention began after a 2017 data breach that affected 147 million people, and the result amounts to a fascinating comeback tale.
The journey isn't yet complete for Equifax, but the company's total addressable market has swelled as it adds data services and expands both organically and through bolt-on acquisitions.
At Constellation Research's Connected Enterprise 2023, I caught up with BT150 member Manish Limaye, SVP, USIS Chief Architect and Head of Data Engineering at Equifax, to talk data architecture and the broader transformation.
Here are the takeaways from our conversation:
Equifax at the intersection of data, analytics and technology. Limaye said Equifax is a "decision intelligence company" that revolves around data. "We help people in pivotal moments of their life like buying a car, mortgage, renting an apartment," said Limaye. "We do so by providing data and insights to businesses in every industry so they can make informed decisions. My role is to connect the dots between technology and data, strategy and execution, and art of the possible and our infrastructure."
Data architecture as the basis of transformation. Equifax's security incident gave the company a chance to reimagine and rethink itself for the next 100 years, said Limaye. To rethink the business, Equifax needed an architecture that eliminated data silos. He said:
"The ability to bring multiple datasets together seamlessly is an essential element of the architecture. Our transformation not only focused on the run of the mill cloud work, but really reimagining how we will bring the data together in our own data fabric. Prior to the cloud, if we have 10 data sets, they are sitting independently, and we join them together. It's a painful exercise. Now with the data fabric we built, we can tap the data at any point in the journey per the legal, regulatory and contractual obligations and really give deep insight to the customer. These are the business drivers behind every technical decision we make."
Lessons learned from Equifax's data architecture. "The most important thing is the quality of data and availability. If the data quality is not there it won't get the results," explained Limaye. "We're in a unique position in the sense that data quality is inherent to the product that we deliver. We're not a typical company where you have an operational system and then you have the analytical system sitting side by side. Equifax is a layer of the data streaming economy. We are really taking the data and making it available as a product."
Limaye said:
- Data quality is everything.
- Regulatory requirements are critical, but if you're in a non-regulated space you can be more experimental.
- Data pipelines need to be consistent across the data fabric.
- If the data pipelines and data quality are there, you can take that information and tailor it to what the company needs.
Data set diversity. Limaye said Equifax absorbs data from thousands of suppliers, and it usually arrives in multiple formats. "We have some industry standards and standardization in some areas and others where it's a commercial agreement," said Limaye. "We have a wide array of diverse data, and you need a comprehensive data strategy to be able to deal with all kinds of data and normalize it."
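To make the normalization point concrete, here is a minimal sketch of mapping two supplier formats onto one common record shape. It is an illustration only, not Equifax's implementation: the supplier formats, the field names (`name`, `zip`, `first`, `last`, `postal`) and the `NormalizedRecord` type are all hypothetical.

```python
from __future__ import annotations

import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedRecord:
    """Hypothetical common schema that every supplier feed is mapped onto."""
    supplier: str
    full_name: str
    postal_code: str


def normalize_name(raw: str) -> str:
    # Collapse repeated whitespace and apply title case.
    return " ".join(raw.split()).title()


def normalize_postal_code(raw: str) -> str:
    # Keep only digits, then truncate to the first five.
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[:5]


def from_csv(supplier: str, payload: str) -> list[NormalizedRecord]:
    # Assumed CSV layout: a header row with "name" and "zip" columns.
    rows = csv.DictReader(io.StringIO(payload))
    return [
        NormalizedRecord(
            supplier,
            normalize_name(row["name"]),
            normalize_postal_code(row["zip"]),
        )
        for row in rows
    ]


def from_json(supplier: str, payload: str) -> list[NormalizedRecord]:
    # Assumed JSON layout: a list of objects with "first", "last", "postal".
    items = json.loads(payload)
    return [
        NormalizedRecord(
            supplier,
            normalize_name(f"{item['first']} {item['last']}"),
            normalize_postal_code(item["postal"]),
        )
        for item in items
    ]
```

Each supplier gets its own small adapter, while the cleaning rules live in shared helpers, so two feeds describing the same person in different formats end up with identical normalized fields.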
Evolution of the Equifax data fabric. Limaye said Equifax's data fabric has grown organically, but it could be described as a data lakehouse+.
"You have a variety of layers--governance, observability streaming, virtualization, catalog and other things. We built our data platform on top of Google Cloud technology, and we standardized the pipelines at every step. When the data comes, it goes through the initial cleaning and transformation. It also gets entity resolution and linking. There is no differentiation between the operation and the warehouse. Because on the one hand you are getting all this data, you're cleaning it up, and you're making it available for the product. We built our own data hub where we collect data for every platform, and it brings the operational data for different types of uses. We call it purpose views. You could use it for online transactions. There's also typical data warehousing using Google technologies where you can do analytics and marketing on top of Google Cloud."
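The flow Limaye describes (clean, resolve and link entities, then expose purpose-specific views) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Equifax's design: the record layout is hypothetical, exact matching on name and date of birth stands in for real entity resolution (which is typically far more sophisticated), and a "purpose view" is assumed here to be a projection limited to the fields a given use is allowed to see.

```python
from __future__ import annotations

from collections import defaultdict


def clean(record: dict) -> dict:
    """Initial cleaning: normalize whitespace and case on the matching fields."""
    return {
        "source": record["source"],
        "name": " ".join(record["name"].split()).lower(),
        "dob": record["dob"].strip(),
        "attributes": dict(record.get("attributes", {})),
    }


def resolve_entities(records: list[dict]) -> dict:
    """Link cleaned records that share the same (name, dob) key.

    Exact-key matching is an assumption made to keep the sketch short.
    """
    linked: dict = defaultdict(lambda: {"sources": [], "attributes": {}})
    for rec in records:
        key = (rec["name"], rec["dob"])
        entity = linked[key]
        entity["sources"].append(rec["source"])
        # Later records overwrite earlier values for the same attribute.
        entity["attributes"].update(rec["attributes"])
    return dict(linked)


def purpose_view(entities: dict, allowed_fields: list[str]) -> dict:
    """Project each entity down to the fields permitted for one purpose."""
    return {
        key: {
            field: entity["attributes"][field]
            for field in allowed_fields
            if field in entity["attributes"]
        }
        for key, entity in entities.items()
    }


raw = [
    {"source": "s1", "name": " Jane  Doe", "dob": "1990-01-01",
     "attributes": {"employer": "Acme"}},
    {"source": "s2", "name": "jane doe", "dob": "1990-01-01 ",
     "attributes": {"city": "Atlanta"}},
]
entities = resolve_entities([clean(r) for r in raw])
marketing_view = purpose_view(entities, ["city"])
```

Because every view is derived from the same cleaned and linked entities, an online transaction lookup and an analytics query would read the same underlying data, each restricted to its own permitted fields.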
The Google Cloud journey. Limaye said the decision to build on Google Cloud was made before he joined Equifax, but the partnership with Google revolves around data, AI and modeling expertise. He said:
"Equifax is deep into statistical modeling. AI, you knew was going to be big. We were one of the leaders in the explainable AI space. When we partnered with Google, it was more than just what I'd call run of the mill cloud transformation. We wanted a deep partnership with Google beyond Google Cloud into data engineering. We partner with them. We learn from them. They learn from us because we have the most complex data. There's also the value and security of the data."
Limaye said Equifax has worked with Google Cloud to automate as much data processing as possible, with integrated business logic that brings the data closer to use in a product.
When the business is the data. Limaye said business and technology alignment is easier at Equifax because the business is data. He said:
"We have a very high literacy among business partners. We have the data analytics group that supports the business and enables the business to come and ask questions. That group also enables data scientists and data analysts to help them with the answers. Business generally comes to us, but if we see something we seek their guidance.
"We cannot afford to have silos. We operate in a highly regulated space. We have to operate where business, data and tech people come together."
Security. Limaye said Equifax vowed to be a security leader after its data breach.
"That security commitment really meant rethinking and reimagining how we look at securing our data given the sensitivity of it. We came up with a proper security control framework and paired it with our cloud native capability. There is an ability to destroy and rebuild data at any time. When you pair security frameworks and cloud together you walk away with a very comprehensive security framework. That control framework gets translated into a series of technical requirements. It's a very rigorous process."
Generative AI. Limaye said generative AI will have to work within Equifax's data and security controls. Equifax has a partnership with Google on large language models, but it remains to be seen how generative AI will be used in consumer-facing products. Limaye said:
"We need to look at the regulation side of it and other guidance. We need to make sure that data remains private. The data cannot be used for training. I think for internal productivity generative AI is a different question. There's a lot more space there for us to be innovative with internal and developer productivity, but we're going to go with a very methodical approach."