IT audits for systems of record data are an annual event at most companies. But auditing artificial intelligence and big data, while ensuring that they are under sufficient security and governance, is still a work in progress.
The good news is that companies already have IT policies and procedures that can be adapted for both AI and big data. These practices are especially helpful at a time when professional audit firms offer limited AI and big data services.
Here are nine questions that companies can use to self-audit their AI and big data:
1. Do you know where your data is coming from?
Companies acquire their own data from business operations, but they also purchase and use data from outside vendors for AI and analytics. All outside data should be evaluated for trustworthiness and quality before it is used in AI and analytics. Vetting data from third parties should be part of every RFP.
2. Have you addressed data privacy?
You can have your own data privacy rules and agreements with clients or customers, but these data privacy rights get stretched when they are extended to outside business partners that may not have the same data privacy standards. In these cases, there should be policies and procedures for data privacy not only in IT, but also in the corporate legal and compliance departments, to ensure that customers or clients whose data could be used, anonymized or shared are aware of that fact.
3. Do you have lockdown procedures?
The Internet of Things and edge computing will increasingly contribute unstructured big data to systems. Because these devices are mobile and distributed, they can be easily lost, compromised or misplaced. At a minimum, IT should have a way of tracking these devices and their usage, and locking them down when they are reported as missing or misplaced.
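As a rough illustration of the tracking and lockdown step, the sketch below picks out devices that should be locked. The device records, field names and the seven-day silence threshold are all assumptions made for this example. The article itself only calls for locking devices that are reported missing. Flagging devices that have stopped checking in is an added assumption about what "tracking" might look like.

```python
from datetime import datetime, timedelta


def devices_to_lock(devices, now, max_silence=timedelta(days=7)):
    """Return IDs of devices that are reported missing or have not checked in recently.

    `max_silence` is a hypothetical threshold; pick one that fits your own policy.
    """
    to_lock = []
    for device in devices:
        silent_too_long = (now - device["last_seen"]) > max_silence
        if device["reported_missing"] or silent_too_long:
            to_lock.append(device["id"])
    return to_lock


# Made-up inventory for illustration.
now = datetime(2024, 1, 15)
inventory = [
    {"id": "sensor-1", "last_seen": datetime(2024, 1, 14), "reported_missing": False},
    {"id": "tablet-2", "last_seen": datetime(2024, 1, 14), "reported_missing": True},
    {"id": "gateway-3", "last_seen": datetime(2024, 1, 1), "reported_missing": False},
]

lock_list = devices_to_lock(inventory, now)
print(lock_list)
```

In practice the lock itself would be carried out by whatever device management tool the company uses. This sketch only decides which devices qualify.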
4. Is all IT aligned with your security settings?
Many edge computing and IoT devices, as well as routers and hubs, arrive with default security settings from their vendors that don’t match corporate security standards. As part of installation procedures, IT should include a step in which default security settings are checked and then set to enterprise security settings before the devices are deployed.
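The pre-deployment check described above could be sketched as a comparison between a device's reported settings and an enterprise baseline. The setting names and values here are hypothetical and stand in for whatever a company's own security standard requires.

```python
# Hypothetical enterprise baseline; the keys and values are illustrative only.
ENTERPRISE_BASELINE = {
    "default_password_changed": True,
    "telnet_enabled": False,
    "firmware_auto_update": True,
    "tls_min_version": "1.2",
}


def find_noncompliant_settings(device_settings, baseline=ENTERPRISE_BASELINE):
    """Return {setting: (actual, required)} for every setting that differs from the baseline.

    The comparison is strict equality, and a setting missing from the
    device report is treated as non-compliant.
    """
    mismatches = {}
    for name, required in baseline.items():
        actual = device_settings.get(name)
        if actual != required:
            mismatches[name] = (actual, required)
    return mismatches


# Example: a router still carrying vendor defaults (made-up values).
factory_router = {
    "default_password_changed": False,
    "telnet_enabled": True,
    "firmware_auto_update": True,
}

issues = find_noncompliant_settings(factory_router)
for name, (actual, required) in sorted(issues.items()):
    print(f"{name}: found {actual!r}, required {required!r}")
```

A device would be cleared for deployment only when the function returns an empty dictionary.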
5. How clean is your data?
An appropriate level of data cleaning, which could involve data discards, data normalization, the use of ETL (extract, transform, load) tools, etc., should be in place. This is to ensure that the data that enters your analytics and AI systems is as “clean” and accurate as possible.
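To make the idea of discards and normalization concrete, here is a minimal sketch of a cleaning pass. The record layout (customer_id, email, amount) and the specific rules are assumptions chosen for the example, not a recommended standard.

```python
def clean_records(records, required_fields=("customer_id", "email", "amount")):
    """Discard incomplete, non-numeric or duplicate records and normalize the rest."""
    cleaned = []
    seen_ids = set()
    for record in records:
        # Discard: drop records that lack any required field.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        # Normalize: coerce the amount to a float; discard if it is not numeric.
        try:
            amount = float(record["amount"])
        except (TypeError, ValueError):
            continue
        # Normalize: trim whitespace around the ID, then discard duplicates,
        # keeping the first occurrence.
        customer_id = str(record["customer_id"]).strip()
        if customer_id in seen_ids:
            continue
        seen_ids.add(customer_id)
        cleaned.append({
            "customer_id": customer_id,
            "email": str(record["email"]).strip().lower(),
            "amount": amount,
        })
    return cleaned


# Made-up input rows for illustration.
raw = [
    {"customer_id": " 101 ", "email": " Ann@Example.COM ", "amount": "12.50"},
    {"customer_id": "102", "email": "", "amount": "3"},
    {"customer_id": "103", "email": "bob@example.com", "amount": "n/a"},
    {"customer_id": "101", "email": "ann@example.com", "amount": "9"},
]

cleaned = clean_records(raw)
print(cleaned)
```

Of the four sample rows, only the first survives: the second is missing an email, the third has a non-numeric amount and the fourth repeats a customer ID.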
6. How accurate is your AI?
The algorithms and data used in AI systems change continuously, so assumptions that are true for the AI today may not hold tomorrow. AI may also incorporate biases that are not immediately detected. Because of this, the process of monitoring and revising AI algorithms, queries and data must be continuous. A procedure for regularly “tuning” AI data and operations should be in place.
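One simple way to put continuous monitoring into practice is to compare recent accuracy against the accuracy measured when the system was approved. The sketch below assumes that predictions can later be matched with observed outcomes. The baseline figure and the 0.05 tolerance are made-up values, and a real program would likely track more than a single metric, including checks for bias.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and the same length")
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(actuals)


def needs_tuning(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """True when recent accuracy has fallen more than `tolerance` below the baseline."""
    return (baseline_accuracy - recent_accuracy) > tolerance


# Made-up figures for illustration.
baseline = 0.90  # accuracy recorded when the model was approved
recent_predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
recent_actuals = [1, 0, 0, 1, 1, 1, 0, 1, 1, 0]

recent = accuracy(recent_predictions, recent_actuals)
print(f"recent accuracy: {recent:.2f}, needs tuning: {needs_tuning(baseline, recent)}")
```

Running a check like this on a schedule gives the "tuning" procedure a concrete trigger.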
7. Who is authorized to touch your big data and your AI?
All big data repositories and AI and analytics systems should be monitored on a 24/7 basis to ensure that only users who are authorized to use the data and systems are accessing them.
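The access check described above can be sketched as a comparison of access log entries against a list of authorized users per system. The system names, user names and log format are hypothetical. A production setup would normally draw on the company's identity management and logging tools instead of hard-coded sets.

```python
# Hypothetical authorization list: system name -> users allowed to access it.
AUTHORIZED_USERS = {
    "sales_warehouse": {"alice", "bob"},
    "churn_model": {"carol"},
}


def find_unauthorized_access(access_log, authorized=AUTHORIZED_USERS):
    """Return log entries whose user is not authorized for the system accessed.

    Access to a system that is not listed at all is also treated as a violation.
    """
    violations = []
    for entry in access_log:
        allowed = authorized.get(entry["system"], set())
        if entry["user"] not in allowed:
            violations.append(entry)
    return violations


# Made-up log entries for illustration.
log = [
    {"user": "alice", "system": "sales_warehouse"},
    {"user": "dave", "system": "sales_warehouse"},
    {"user": "alice", "system": "churn_model"},
    {"user": "carol", "system": "churn_model"},
]

violations = find_unauthorized_access(log)
for entry in violations:
    print(f"ALERT: {entry['user']} accessed {entry['system']} without authorization")
```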
8. Is your AI fulfilling its mission?
At a minimum, AI systems should be assessed annually to confirm that they are meeting the demands and missions of the business. If they aren’t, they should be revised or discarded.
9. Can you fail over if AI fails?
If you embed AI operations into business processes, your disaster recovery plan should address a scenario in which these systems become inoperable. If a system experiences downtime, what will you do? Will there be a backup system that quickly comes online? Or a set of manual procedures (and employees who know how to execute them) that can take over until the AI system returns? Can the business defer decisions that the AI makes until systems are back up? The procedures for downtime should be clearly enumerated for both IT and the end business.
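One of the options above, deferring decisions to a manual process while the AI system is down, could be sketched as follows. The function names, the use of ConnectionError to signal an outage and the in-memory queue are all assumptions made for this example.

```python
def decide_with_fallback(request, ai_decide, manual_queue):
    """Try the AI system first; if it is unreachable, defer the request to a manual queue."""
    try:
        return {"status": "decided", "decision": ai_decide(request)}
    except ConnectionError:
        # The AI system is down: hand the request to staff who follow manual procedures.
        manual_queue.append(request)
        return {"status": "deferred", "decision": None}


def healthy_ai(request):
    """Stand-in for a working AI service (hypothetical rule)."""
    return "approve" if request["amount"] < 1000 else "review"


def failing_ai(request):
    """Stand-in for an AI service that is experiencing downtime."""
    raise ConnectionError("AI service unavailable")


manual_queue = []
ok = decide_with_fallback({"id": 1, "amount": 250}, healthy_ai, manual_queue)
down = decide_with_fallback({"id": 2, "amount": 250}, failing_ai, manual_queue)
print(ok["status"], down["status"], len(manual_queue))
```

Whatever form the fallback takes, the point is that it is written down and tested before an outage occurs.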