Information Security Framework for Solving Big Data Privacy Issues
By
Nithish Reddy Arawala
A dissertation submitted in partial fulfillment of the requirements of the degree of
PhD in Information Technology
At the
UNIVERSITY OF THE CUMBERLANDS
2021
INFORMATION SECURITY FRAMEWORK FOR BIG DATA
Approval for Recommendation
This dissertation is approved for recommendation by the faculty and administration of the
University of the Cumberlands.
Dissertation Chair:
__________________________________
Dr. Ray Bynum
Dissertation Evaluators:
__________________________________
Dr. Jim Webb
__________________________________
Dr. Lori Farr
Acknowledgment
I wish to express my sincere gratitude to my supervisors for their guidance and support in carrying
out the dissertation. I extend my regards to all the staff members of the Department PhD-IT for
their moral support and encouragement during the dissertation period.
I thank my classmates, family, and friends for their moral support, financial support, and well
wishes. God bless you all.
Abstract
Big data is unexpectedly changing the face of the global economic system in the twenty-first
century. Most organizations have faced challenges in protecting customers’ intellectual property
and safeguarding their personal information to maintain confidentiality and ensure business
integrity and stability. As a result, security frameworks deployed to solve these data privacy
issues help most institutions safeguard information. Data privacy is a significant concern as most
companies have failed to protect the customer’s confidential information, which is paramount in
financial institutions. Although the big data that companies collect and refine can reveal
extraordinary insights that give them a competitive edge, a large portion of the data is
personal and, if not used carefully, can lead to serious privacy violations. These violations arise
because the companies that obtain the data mishandle it or because hackers and cybercriminals
compromise it. Therefore, if the data is not safeguarded well or not used for its intended purpose, it
can land in the hands of people with evil intent and cause significant damage. This research
seeks to provide an information security framework to help organizations utilize big data and
preserve privacy. The research used a qualitative research method. The study population was
financial institutions in San Antonio, Texas. Information technology experts working in these
institutions shared their experience managing big data and privacy. The proposed framework
provides holistic techniques and methods that maintain privacy when handling big data.
Table of Contents
Approval for Recommendation....................................................................................................... 2
Acknowledgment ............................................................................................................................ 3
Abstract ........................................................................................................................................... 4
Chapter One .................................................................................................................................... 9
Introduction ................................................................................................................................. 9
Background and Problem Statement ......................................................................................... 14
Purpose of the Study ................................................................................................................. 24
Research Questions ................................................................................................................... 26
Theoretical Framework ............................................................................................................. 27
Limitations of the Study ............................................................................................................ 39
Assumptions of the Study ......................................................................................................... 40
Definitions ................................................................................................................................. 42
Summary ................................................................................................................................... 47
Chapter Two.................................................................................................................................. 51
Introduction ............................................................................................................................... 51
Structure of Banking Systems ................................................................................................... 53
Big Data..................................................................................................................................... 55
Big Data Analytics in the Financial Sector ............................................................................... 59
Big Data Technologies .............................................................................................................. 70
Data Privacy Issues ................................................................................................................... 77
Cybersecurity ............................................................................................................................ 81
Security Measures for the Internet of Things ............................................................................ 95
Security Frameworks for Data Privacy ................................................................................... 100
Privacy Legal Mechanism ....................................................................................................... 121
Related works .......................................................................................................................... 124
Summary ................................................................................................................................. 126
Chapter Three.............................................................................................................................. 128
Introduction ............................................................................................................................. 128
The Research Paradigm ........................................................................................................... 129
Research Design ...................................................................................................................... 140
Sampling Procedures ............................................................................................................... 141
Data Collection Sources .......................................................................................................... 144
Ethics ....................................................................................................................................... 145
Data Processing and Analysis ................................................................................................. 151
Summary ................................................................................................................................. 161
Chapter Four ............................................................................................................................... 163
Introduction ............................................................................................................................. 163
Participants and Research Setting ........................................................................................... 164
Analysis of Research Questions .............................................................................................. 165
Supplementary Findings .......................................................................................................... 174
Big Data Management ......................................................................................................... 175
Cyber Risk Assessment and Management ........................................................................... 180
Summary ................................................................................................................................. 185
Chapter Five ................................................................................................................................ 187
Introduction ............................................................................................................................. 187
Practical Assessment of Research Question(s) ....................................................................... 188
Limitation of the Study ........................................................................................................... 205
Implication for Future Study ................................................................................................... 205
Summary ................................................................................................................................. 209
References ................................................................................................................................... 212
Appendices .................................................................................................................................. 240
List of Tables
Table 1: Themes ........................................................................................................................175
Table 2: Proposed Information security framework ........................................................................201
Table 3 ISO security framework ..................................................................................................240
Table 4 The seven risk model ......................................................................................................241
Table 5 Years of Experience .......................................................................................................242
Table 6 Interviewees’ profile ......................................................................................................242
Table 7 Age of participants .........................................................................................................243
List of Figures
Figure 1 Scheme for addressing Big data privacy Concerns ............................................................244
Chapter One
Introduction
As customers across the globe have continued to embrace the digitization of most business
services, it is clear that the customers’ personal data is substantially valuable to different
stakeholders. Monitoring of consumer behavior has become the foundation of many business
advertising and market targeting today. In the last decade, big data has become a more valuable
source of significant insight to businesses than ever before.
Many companies store large datasets, intending to use them to understand
customer preferences and behavior. Clive Humby, a British data scientist and mathematician,
described data as the new oil (Hasan et al., 2020). However, just like oil, data is valuable but
cannot deliver significant benefit without refining. Oil is extracted in crude form and separated into
gas, petrol, diesel, and other forms used to make different products. The same applies to
data, which must be broken down and analyzed before it becomes useful (Hasan et al., 2020).
Big data, by definition, entails large data sets that are growing and consist of structured,
unstructured, and semi-structured data formats (Oussous et al., 2018). IBM defines big data as
“data sets whose size or type is beyond the ability of traditional relational databases to capture,
manage and process the data with low latency.” Characteristics of big data include high volume,
high velocity, and wide variety (IBM, 2018).
Big data analytics is the process of analyzing large data sets that are difficult to handle with
traditional data processing application software.
The primary role of big data analytics is to enable scientists, predictive modelers, and other
analytics experts in organizations to make effective business decisions (Oussous et al., 2018). Big
data is applied in many fields, including government, social media analytics,
technology, fraud detection, call center analytics, banking, agriculture, marketing, smartphones,
telecom, and healthcare (Sahu, 2018).
Traditional data analytics and processing can no longer cope with data at this scale. Big data
analytics can solve numerous real-world challenges, such as replacing routine maintenance with
predictive analytics (Wu et al., 2020). For example, UPS handles a very large volume of deliveries.
It keeps up by using big data to analyze data from thousands of trucks and predict which vehicles
are likely to break down, saving maintenance costs. IBM has also used such
algorithms to predict the repairs that the city of Boston needed, reducing costs. Rio used big data
to predict and respond to deadly landslides (Wu et al., 2020).
With big data, it becomes possible to unlock data, even confidential data, that has always
been around us but was never stored, analyzed, or quantified. This presents opportunities for
predicting events before they happen, which is essential and valuable (Mendes & Vilela, 2017).
With a wealth of information flowing, things work better and it is possible to prevent bad
things from happening, increasing revenue and, in some circumstances, saving lives (Wu
et al., 2020). Human beings cannot watch and process this much information effectively; therefore,
big data tools are ideal for keeping events and operations running smoothly.
Big data is changing data warehousing completely. Data warehousing is the central
storage of unified and sanitized data from single or multiple sources (Mendes & Vilela, 2017).
Data warehouses store present and past data, and the aggressive expansion of data volumes has
led to an increase in costs, raising questions about the effectiveness and
scalability of data warehousing. These costs include licenses, hardware, and CPUs.
Many companies require new and modern ways to conform to the latest technology
requirements, hence big data analytics (Wu et al., 2020). There is a definite transition from
on-premises warehousing to cloud solutions; therefore, there is no need to purchase physical hardware,
and cloud-based data warehouses can offer better solutions. They provide a faster way to process and
analyze data.
Privacy is a human right. Article 12 of the Universal Declaration of Human Rights states,
“No one shall be subjected to arbitrary interference with his privacy, family, home or
correspondence, nor to attacks upon his honor and reputation. Everyone has the right to the
protection of the law against such interference or attacks” (United Nations, 2020, p. 4). The
scope of privacy is broad and consequently spans several areas. Information privacy is one such
category; it involves the handling and processing of personal data. People have a right to determine
how they want their information to be used and communicated to others (Mendes & Vilela,
2017).
With the advent of big data, the expansion of data brings new possibilities and uses;
however, unwanted privacy violations can occur (Mendes & Vilela, 2017). Financial institutions
face limitations on the extent of data analysis and usage, which prevents them from fully benefiting
from big data analytics. The world market is competitive, and with the surge of innovations,
businesses need to stay competitive, attract customers, and be sustainable in the long run.
The user interface provides powerful management to a single platform that relies on
customers’ campaigns and uses a single platform that raises multiple IT services demanded by
customers (Mendes & Vilela, 2017). The information will respond to the customer’s key life
events, detect any behavioral changes, and provide maximum security for the data. Financial
crime contributes to societal ills and economic instability. Thus, financial inclusion, as a
mitigation and prevention measure, needs to be a priority.
Using big data improves an institution’s intelligent judgment platform for its customers and helps
protect customers’ information from breaches or hacking (Mendes & Vilela,
2017). With the introduction of customer DNA, an institution evidently needs to give
its clients a clear understanding of how each segment of their data is protected and kept free from
individual data breaches and vandalism. Artificial intelligence’s power provides index
recommendations on the best action to delete conversation and customer personal experience.
These offer a real engagement to the customers and create awareness of advocacy and the best
customer experience in storing their information.
Big data analysis will rely on customer DNA services to develop an advanced profile
with a sophisticated standard using CRM with these tactics (Lilley, 2018). It will provide a single
customer view to ensure the populated customers cannot campaign and include structured data
into the internet and external sources (Lilley, 2018). Information is collected and digitized as a
customer organization based on potential analytics, cyber threats, and cyberattacks. As a result,
there is a need for large amounts of consolidated information to protect customer information
from attacks.
Financial institutions need to deploy intellectual protection services to safeguard user
information from breaches and threats that arise with this improved financial institutions’
technology (Moreno et al., 2016). The existing innovation of big data relies on revenue
generation and streams to minimize life-threatening viruses and revolutionize the organization
through the lengthy investment of cybersecurity risks (Moreno et al., 2016). Therefore, financial
action task forces need some implementation application with common compulsory national
lottery regional determination that constitute an offense and constitute the requirement of a
common sanction approach to avoid breaches.
Increasing financial logistics structures to support anti-financial crime organizations’
domestic and multilateral public sectors will mitigate data breaching risks (Tao et al., 2019).
These logistics enhance building a better global framework to fight financial crimes in
businesses and societal imperatives. Advancing public-private partnerships is another
leading factor in overcoming these challenges. At the associated level, both financial institutions and
law enforcement agencies must work together to protect the public from harm and crimes (Tao et
al., 2019).
Besides, Cybereason and scale are essential successful tools to secure big data. It is
crucial to develop more strict rules and regulations to secure big data and reach global
considerations that institutions will not disregard without massive financial ramifications (Tao et
al., 2019). There is a need for tool modeling to allow data collection and minimize end-user
disruptions. This solution will apply statistical analysis and machine learning and will adapt
automatically to changes in the security environment (Tao et al., 2019).
The economics of privacy encourages a proper collection of processed information and
stores them in a safer place for accessibility by authorized users (Tao et al., 2019). Moreover, the
economic perspective analysis needs to minimize cybersecurity issues to protect user information
and ensure privacy deployment in all the banking and financial institutions (Tao et al., 2019). All
banking institutions need to provide their customers with educational resources to reduce
financial threats and help them detect attacks on their bank accounts. Employing knowledgeable and
skilled workers will likewise reduce the threat of attacks and give the greatest protection to client data.
It is imperative to identify the current measures undertaken by a management team to
secure data that need to be implemented and use the latest data security measures to protect user
information in banking institutions and the government sector (Jain et al., 2016). Big data analytics is
a crucial route, and data warehousing is still required alongside modern techniques to remain productive
and efficient. However, most organizations are believed to still maintain traditional data warehousing
(Jain et al., 2016).
Background and Problem Statement
The digital divide is the gap between regions that have access to modern
information and communication technology and those that do not (Solangi et al., 2018). Today, the new
economy is the prevailing trend. Information and communication technology has brought many
innovations that have driven the growth and development of many world economies (Tao et al., 2019).
Information communication technology is among the reasons for the rapid economic growth we
are experiencing. However, economic growth must go hand in hand with social and democratic
agendas, especially when dealing with inclusion (Solangi et al., 2018).
People and societies that cannot take part in the modern economy are at a
disadvantage because everything operates through information communication technology.
Therefore, the digital divide leaves some communities out of crucial decision-making and
proper participation (Solangi et al., 2018). Because information
communication technology is essential for the social and economic development of all regions,
closing the digital divide should be a priority.
Ending the digital divide depends on giving people exposure through proper education and
training (Solangi et al., 2018). New technology cannot be helpful if people lack skills and
competence (Tao et al., 2019). The divide can be bridged if schools and colleges offer the proper
education and training so that most of the population is literate and has acquired the right skills.
The digital divide stems not only from underdevelopment but also from a lack of technical
knowledge. In some cases, the cost of ICT devices has led to a rise in the digital divide (Solangi
et al., 2018). Types of digital divide include access divide, use divide, and quality of use gap
(Solangi et al., 2018).
The banking sector has changed, and customers prefer using checks instead of cash and
using electronic banking to perform transactions. As a result, banks have created mobile apps
and portals to ensure convenience and improve customer service (Solangi et al., 2018). However,
this transition leads to serious cybersecurity risks. Many banking applications are vulnerable
because of server insecurity, unsafe data storage, possible data leakage, improper encryption, and
inadequate authentication and authorization when logging in.
The technology behind big data is disruptive, with both positive and
negative outcomes. Most customers are willing to provide information that helps them save money
or gives them better product and service options. Organizations use big data extensively to drive
the finance sector toward digitization (Jain et al., 2016).
Utilizing big data, companies gain technological, financial, and competitive advantages.
Technical benefits enjoyed by institutions using big data include scalability, accessibility of
accurate data, and integration of structured and unstructured data (Almeida, 2017). The
competitive advantage involves increasing customer satisfaction, insights into consumer
behavior, new products and services, new business models, increased customer loyalty,
data-driven marketing, and increased sign-ups (Almeida, 2017). Financial benefits include increased
sales, sales leads, and return on investment (Almeida, 2017).
Data collection attracts potential risks to privacy (Soria-Comas & Domingo-Ferrer,
2015). Financial institutions collect vast amounts of data and become targets of cyber-attacks
because of the sensitivity and value of the information they hold. Compared to other sectors such
as health, the finance industry is the most data-intensive (OECD, 2020). The threats include
“data breaches, internal misuses by the institution’s employees, unwanted secondary use,
changes in companies’ practices, and government access without due legal procedures”
(Soria-Comas & Domingo-Ferrer, 2015).
Most large financial companies have embraced the technology to implement a digital
transformation, address customer needs, and improve profit and loss results (Jain et al., 2016). The
data is valuable for the success of the companies, but the question is what influence and implications
the information has for the financial sector. Every financial service has become technologically
innovative, relying heavily on big data.
However, despite the revolution that big data technology has caused in the financial
sector, it has also led to privacy issues that affect the industry. The protection and security of big
data is the most crucial problem affecting the finance sector (Azeroual & Fabre, 2021).
Data quality and regulatory requirements are also considered critical issues with big data.
Although all financial products and services depend on data and produce it each second,
research on finance and big data has not yet peaked (Azeroual & Fabre, 2021). Therefore, future
researchers will need to focus on financial data management to address the technical issues and
help financial companies benefit from big data. The larger the company, the larger the volume of
data; hence, more security and protection are required (Tao et al., 2019).
A good example of a financial company that has experienced big data privacy issues is
Equifax, a credit reporting company. The company announced that cybercriminals had managed
to steal the personal data of approximately 143 million customers in the United States (Hasan et
al., 2020). Later, the estimate was updated, and the number rose to
147.9 million customers. The stolen data included sensitive information such as dates of birth,
Social Security numbers, and home addresses. As a result, the company incurred $700 million
in compensation and fines (Hasan et al., 2020).
This breach indicates that the stakes are high for organizations in the finance sector that handle big
data. Even customers who were not directly affected by the data breach became more attentive
and vigilant about managing their data, according to a survey conducted by McKinsey of
1000 consumers in the North American market on their views regarding data collection, privacy,
breaches, hacks, communications, and regulations (Hasan et al., 2020). The survey found that customers
grew increasingly selective about the types of personal data they shared.
However, the customers indicated a high willingness to share their personal data with
financial service and health service providers. Despite this willingness to share personal data
with the two types of service providers, neither scored a trust rating above 50% for data privacy
(Hasan et al., 2020). This low trust is understandable given the recent history of data breaches at
financial organizations like Equifax. In the survey, the researchers from McKinsey discovered
that customers were aware of past data breaches, which significantly influenced their responses
(Hasan et al., 2020).
It is critical to emphasize that privacy is not easy to understand, let alone identify and
evaluate. It means different things to different individuals, such that the amount of information
each person is ready to disclose varies significantly (Fang & Zhang, 2016). As a result, it is
of great interest to individuals to decide which personal data they release to the outside
world.
Furthermore, consumer data is worth a lot to financial companies, but it might not
benefit the consumers themselves (Swinnen, 2018). This leads to an asymmetric situation because the
companies reap substantial economic benefits from consumer insights. In contrast, consumers benefit little or
receive no help at all. The companies can also use the insights to engage in dubious practices
such as providing expensive services.
Big data brings many conveniences to businesses for data-driven decision-making
(Fang & Zhang, 2016). However, many institutions also encounter drawbacks with big data. One
of these drawbacks is privacy. If the methods available during use do not
adequately protect user data, privacy is threatened. With big data, in addition to
traditional privacy issues, using personal information for analysis and research can itself invade privacy
(Hasan et al., 2020).
Institutions use mechanisms such as anonymous identifiers to hide their customers’ identifying
information during analysis; however, this is insufficient because other content can still
identify customers accurately (Sei et al., 2019). Institutions lack sufficient methods and
processes, hence the low adoption of big data by many institutions (Begenau et al., 2018).
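The limitation of anonymous identifiers described above can be illustrated with a minimal sketch. It assumes a small, entirely hypothetical set of customer records in which names have already been replaced by anonymous IDs, and it counts how many records share the same combination of remaining attributes (here `zip` and `birth_year`, both illustrative column names). A group size of 1 means a customer is still unique on those attributes and could be re-identified by linking with outside data. This group-size check is a simple illustration in the spirit of k-anonymity; it is not a method taken from the cited studies or from the framework proposed in this dissertation.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    values for all quasi-identifier columns (0 if there are no records)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical customer records; names are already replaced by anonymous IDs.
# All values are invented for illustration only.
records = [
    {"id": "a1", "zip": "78201", "birth_year": 1985, "balance": 1200},
    {"id": "b2", "zip": "78201", "birth_year": 1985, "balance": 560},
    {"id": "c3", "zip": "78205", "birth_year": 1990, "balance": 9800},
]

k = smallest_group_size(records, ["zip", "birth_year"])
print(f"smallest group size: {k}")
if k == 1:
    print("at least one customer is unique on zip and birth year")
```

In this sketch, the third record is the only one with its zip code and birth year, so hiding the name alone does not prevent that customer from being singled out.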
Big data is among the most prominent emerging issues in the age of innovation, technology, and the
Internet of Things. It significantly influences business activities and operations; hence, identifying the
effects is crucial for organizations (Hasan et al., 2020). Big data impacts financial markets,
internet finance, credit services, risk analysis, financial management, and fraud detection (Hasan
et al., 2020). Hence, it has become an integral part of the financial sector's innovation and
development.
There are various financial businesses, such as retail banking, online peer-to-peer
lending, SME finance, mobile money transfer and payment platforms, asset management
platforms, and many more (Hasan et al., 2020). These financial entities generate thousands of
data sets daily; hence, big data management is part of their operations. Big data helps
companies understand customers’ activities and financial markets and make investment decisions
that would benefit everyone. Due to the number of transactions and data transmitted through
financial institutions, big data attracts much attention in the financial sector (Begenau et al.,
2018).
Financial institutions utilize multiple data sets every day to make decisions, especially in
trade, risk analysis, and investments (Hasan et al., 2020). Data proliferation is changing the way
businesses handle information. Firms in the financial industry are exploring ways to manage the
enormous amount of information they collect and transform it into valuable insights. The insights will drive
growth and give companies a competitive edge (Fang & Zhang, 2016). One of the challenges
hindering the progression and application of big data is privacy protection (Yu, 2016).
Security risk is a concern for financial institutions collecting and distributing data across
networks and systems (Fang & Zhang, 2016).
Securing vast amounts of data from threats is crucial for institutions, which must analyze their
systems to detect and prevent potential threats (Diniz et al., 2017). Corporate Finance Solutions
(2020) points out that the financial sector is a data-intensive sector with extensive opportunities to
leverage data and gain valuable insights that can revolutionize finance.
Big data benefits include real-time stock market insights, financial models, customer
analytics, risk management, and fraud detection (Corporate Finance Solutions, 2020). Challenges
facing financial institutions in big data applications include meeting regulatory compliance, data
privacy, and data silos (Corporate Finance Solutions, 2020). Data privacy concerns are associated with data
storage in cloud computing. As a result, firms are concerned about putting sensitive data in the
cloud. Many firms utilize public cloud networks; some have tried private clouds; however, these
are expensive to acquire and maintain (Corporate Finance Solutions, 2020).
Big data deployment is beneficial; however, handling sensitive information across
systems increases the risk of compromise, which can outweigh any advantages the companies might
acquire (Diniz et al., 2017). Moreover, data compromise damages companies’ reputations and
leads to financial losses. Because of these concerns, financial institutions are hesitant to apply big
data analytics. Fang and Zhang (2016) point out that 62% of banking firms are cautious about
utilizing big data because of privacy issues. These privacy concerns continually restrict institutions
in handling and analyzing customers’ personal data.
High volumes of data from numerous sources pose privacy and security risks (Diniz et
al., 2017). Consumers have concerns about how their personal data is used and how it can be misused or
accessed unlawfully. A survey conducted by McKinsey found that 44% of respondents trusted
financial services with their personal and sensitive data (Anant et al., 2020). The numerous data
breaches that institutions report have heightened this lack of confidence. Organizations respond
poorly to data breaches, and many records are exposed.
Concerns over data privacy push consumers to seek services from other providers
(OECD, 2020). Based on the McKinsey survey, consumers have high trust in organizations
that request limited personal information and act promptly following breaches in
their systems (Anant et al., 2020).
Further, in instances where a transaction involves money management and is significant,
consumers are ready to share information. For a less critical transaction, people choose to restrict
the type of information they share. This research aims to find ways to reduce the privacy issues
associated with big data by proposing an information security framework that will provide
methods that will guide agencies, including banks that face ransomware, cyber-attacks, and other
threats in the world today. In addition, the research generates customer confidence in all the
financial institutions by suggesting the best ways to reduce the scandals and pressure on
customer personal data privacy standards. Although digital technology and big data privacy and
security measures exist, these measures are inadequate since systems still face growing
threats, particularly during the Covid-19 pandemic (Mills, 2020).
The research problem is the prevalence of privacy issues associated with the security
protocols used in implementing systems and handling user information, as reflected in reported
cyberattacks. As a result, most organizations face challenges in
protecting customers’ intellectual property and safeguarding their personal information to
maintain confidentiality and ensure business integrity and stability.
An article by the Texas banking association highlights that cyberattacks against banks in
Texas increased by 238% in 2020 (Mills, 2020). Wire transfer attempts
and ransomware attacks have also increased. The rise in attacks is attributed to the Covid-19
pandemic, during which much of providers’ attention shifted, and hackers used the opportunity to attempt attacks
(Mills, 2020). The rise in attacks adversely affects the application of big data. The research aims
to identify current issues in big data management and to provide a framework
that will strengthen existing defenses to minimize vulnerability and protect user information.
Financial institutions face information breaches because of inadequate preparation and a lack of
better cybersecurity strategies (Security Intelligence Staff, 2019). Consequently, institutions need
to implement the best ways to solve the problem, use reliable sources, and improve the
security of the Internet of Things to provide maximum protection for user information
(Yang et al., 2019).
Research that needs emphasis is the privacy protection model based on the multiple
levels of trust system that would ensure low priority of data is compromised with a data privacy
periodic system’s focus. Therefore, financial institutions need to develop frameworks that
will assess the security levels of computing environments before acquiring them, enabling
institutions to evaluate cloud services and choose the best option.
Most financial and government institutions face critical threats to information risk and the
vulnerability of customer information (Swinnen, 2018). Big data provides many benefits that
help enhance investment and budgeting decisions (Huttunen et al., 2019). Big data deals with
volume, variety, velocity, predictive analysis, and user behavior analytics. However, the
rapid growth of information technology, the internet, and mobile communication networks
has led to problems in speed, structure, volume, cost, value, security, and privacy, among
other things (Tian & Zhao, 2015).
Different methods need to be deployed to minimize these threats and make
financial institutions safe places for personal information, such as closing possible loopholes
that might breach security. The deployed system needs to be strong and concentrate on
preventive measures such as antivirus software, firewalls, and anti-malware applications to ensure
maximum data protection and secure big data analysis.
A report by Carnegie Endowment for International Peace in 2021 highlights
cybersecurity incidents that target financial institutions as more regular, complex, and disastrous.
Entities are losing millions of dollars, reputation, and credibility due to attacks (Carnegie
Endowment for International Peace, 2021). In the United States, the government and various
bodies have issued warnings on impending attacks and threats towards financial institutions.
On July 10, 2020, the SEC released a statement warning financial institutions of ransomware
attacks aimed at gaining access to their systems. A report released by IBM Trusteer researchers
described criminals stealing from banks in America and Europe using mobile emulators
(Carnegie Endowment for International Peace, 2021). Hackers used 20 emulators to spoof
16,000 phones and gain access to usernames and passwords. They entered the emulators and
used them to steal money from mobile accounts and initiate money orders (Carnegie Endowment
for International Peace, 2021). FBI Director Christopher Wray released a report warning
financial institutions of a third-party service attack (Carnegie Endowment for International
Peace, 2021).
Financial institutions are facing multiple forms of attack. An SMS phishing
scheme has targeted PayPal user accounts. The scheme sends messages to users, warning them
that they need to verify their identities because a limitation has been placed on their account. Hackers
breached Dave, a banking app based in the United States, through Waydev, a third-party service used by the
company, and published the personal information of 7.5 million users on the hacker forum RAID.
Finally, a banking Trojan, Zeus Sphinx, surfaced, targeting banks in the United States, Canada,
and Australia (Carnegie Endowment for International Peace, 2021).
A cyberattack compromised 8,261 personal accounts at Tesco Bank. Hackers
performed unauthorized transactions, about 80% of which the bank stopped from going
through. As a consequence of the attack, customers received distressing text messages about the
breach of their accounts, and others could not make payments using their cards (OECD, 2020).
Cyberattacks are on the rise, and the outcomes are devastating.
The risk for financial institutions is 300 times higher than for other firms (Kuepper, 2020). In addition, the Federal
Reserve Bank of New York points out that the interconnectivity of banks threatens the systems,
and an attack on small banks could initiate a spillover effect to the top banks (Kuepper, 2020).
All these reports and events indicate that cyber risk is a concern for businesses, and there is a
need to manage and reduce the risks.
Legal and administrative access to a control management panel will allow
users to select the best method to use and identify any problems. The frameworks provide
additional security control measures to improve and limit data utilization and storage, among other
helpful functions. All banking institutions must provide their customers with educational resources
to reduce financial threats and detect problems with any bank account (Tsingou, 2020). Employing skilled
employees and customer care services will also reduce the risk of attacks and provide maximum
protection for user information. Institutions need to be knowledgeable about the technology used
by hackers and find ways to ensure that information access controls are part of all the prevention
protocols they put in place.
All financial institutions should keep pace with improving technology and invent new ways to
reduce threats to data security in real time and to protect access control, communication,
and encryption methods. It is vital to encrypt data and control access to it, as most
smart devices and storage devices are vulnerable. TrustZone-based technology developed by ARM
can be deployed to protect confidentiality in cloud computing
(Hasham et al., 2019). Software-based data distribution control ensures maximum security before
a message is sent to users, not by limiting access but by adding extra protection to the essential
embedded software and to the software package as a whole.
Purpose of the Study
This qualitative study aimed to explore the information security strategies that
financial institutions use to manage big data and maintain privacy, and to add knowledge by providing a
framework for protecting user information. The chosen
qualitative case study method sought to explore existing strategies and determine the best controls
for developing an information security framework that keeps user data safe. The data collected was
analyzed and used to build a comprehensive framework that institutions can utilize to maintain
big data privacy.
The study attempted to learn the barriers and challenges financial institutions face in
securing personal user data and the measures that secure user data and reduce the
vulnerability of information. A gap exists in the literature regarding a comprehensive framework, and
the research provides institutions with a guide to maintaining privacy at all points. This study
covered the gap by providing a framework that ensures big data privacy. The suggested methods are
likely to safeguard customers’ personal information, protect business entities, maintain the
confidentiality and stability of stored data, and ensure business continuity within institutions (Wagner et
al., 2019).
The finance industry generates massive data from transactions and operational activities
(Pejić Bach et al., 2019). Institutions can exploit this data to generate new knowledge. Financial
institutions already have input from customers’ feedback and activities, and they benefit further from using
big data technologies in a competitive market (Pejić Bach et al., 2019). Analyzing data provides
forecasts that better predict market position and help boost profitability and competitiveness
(Huttunen et al., 2019).
Many researchers claim that big data brings changes to business models in the finance sector
(Almeida, 2017). OECD (2020) points out that financial service providers use the wealth of data they
obtain for customer profiling, risk assessment, account aggregation, and fraud detection.
Potential goals in finance are to enhance management and governance. The adoption of
big data has many benefits; however, it poses many challenges, and information security remains
the topmost issue hindering institutions from utilizing these technologies (Almeida, 2017).
The research target population is financial institutions in San Antonio, Texas. The
institutions are facing random cyberattacks, and customer information is at risk. During the
pandemic, the risk and threat of malware against San Antonio banks increased sharply
(McGlasson, 2017). For this study, the purpose of the population was to provide critical
information on reducing information breaches in the future. In addition, all financial
institutions must manage big data successfully and maintain their security and compliance with
the regulatory rules and progression rates.
The research suggests the existing systems in financial institutions are inadequate to
mitigate sophisticated and more frequent attacks. As a result, data privacy is a significant
concern today. Failure to protect intellectual property will lead to more severe issues in the future
that interfere with all financial and government systems (Neville-Rolfe, 2016). The research
proposes strategies in the information framework to protect personal user data against threats and
compromise.
The design of the research included a qualitative case study method. Interviews were
used to explore the various challenges IT experts face today regarding big data privacy and the
strategies used to safeguard user information. Qualitative research helps gain a deeper
understanding of the phenomenon explored through interactions with participants who provide
insights (Etikan & Bala, 2017). It is imperative to note that the adoption of data protection
includes specialized software tools and advanced equipment that will enhance the existing
security measures and reduce the vulnerabilities in financial institutions.
Research Questions
1. What barriers and challenges do financial institutions face securing personal user data?
2. What measures secure user data and protect the vulnerability of information for financial
institutions?
Theoretical Framework
Big data security entails protecting user data when performing data processing, storage,
distribution, and analytics. Addressing privacy issues requires understanding provider perception
and the use of systems in offering protection to user data. The theoretical frameworks for this
study include the Communication Privacy Management (CPM) theory, the Information Security
Management System (ISMS), and the National Institute of Standards and Technology (NIST)
Risk Management Framework. These frameworks aim to provide critical controls to include in a
model for creating an information security framework. Scholars use the frameworks to understand
the security structures and designs that organizations use for their systems.
The CPM theory aims to give a comprehensive outlook on privacy and the components
that influence information privacy. The ISMS and NIST frameworks provide an understanding
of the controls used to enhance information security. These controls strengthen the privacy of
data. In addition, the use of the frameworks is to outline the processes that enhance data security
and privacy.
Communication Privacy Management is a system that “regulates disclosing and
protecting private information when others are involved” (Allen, 2017, p. 1). Privacy refers to
the ability to determine when to disclose information and to what extent. The theory considers
the impact of the disclosure of private information. Furthermore, the theory explains the reasons
people disclose private information. The communication privacy management (CPM) theory
guides an organization in identifying and protecting private data. This theory entails the best way
to reveal this threat and control the power of information beyond the management theory, which
considers people’s selected information and comes up with criteria to handle all the information
to ensure ownership of data and management.
The communication privacy management theory suggests that firms need resources and
strategies that provide privacy when handling individuals’ personal information. Many
consumers dread the lack of control over personal information (Yuliarti et al., 2018). The
convenience of disclosing personal information is dependent on the assurance that information
will be safe and protected from any threats.
The risk associated with disclosing private information makes people vulnerable to
exploitation (Allen, 2017). CPM suggests boundaries should be placed on information to
differentiate between public and private information. The boundaries also control the
accessibility to information and the expectations for information use and disclosure.
Further, CPM suggests that individuals obtaining private information should develop
methods to protect privacy. Many authors have widely discussed the applicability of CPM to
solving privacy issues in organizations associated with technologies (Allen, 2017; Yuliarti et al.,
2018). CPM guidance on privacy management helps institutions reduce theft, unauthorized access,
and malware attacks. The study will apply the CPM theory to understand the ways institutions
protect the privacy of the consumer data they collect.
Conversely, communication privacy theory consists of related studies that analyze
category sandwich transcriptions of interviews using the planned behavior theory on social
networks during the big data relying on this series to secure social networks serial (Griffin,
2016). Also, rely on the cloud for big data processing and different outsourcing and
complementation applications with their characteristics.
Communication privacy management is a research theory for understanding how people
reveal and conceal private information. The theory proposes that people manage their
communication boundaries with other people based on the possible benefits and costs of disclosing the data
(Petronio & Child, 2020). Communication privacy management helps explain the privacy
management process (Allen, 2017).
The privacy boundary shows the difference between private and public information.
Communication privacy management principles include “people believe in the right to control
their private information, and people control their private information by following personal
privacy rules. Also, when other people obtain personal private information, they become partners
of the information” (Knight, 2017, p. 23). Co-owners of private information need to form
mutually acceptable privacy rules about telling others. The theory elements include:
Private Information
Private information concerns the act of hiding and disclosing personal details. Sharing private information
means sharing data with other people, but within the limits set by the owner of the information.
Where boundaries are drawn depends on the people involved and the decision to share. This is the
process of privacy rule management (Petronio & Child, 2020).
Private Boundaries
In communication privacy management theory, it is essential to respect the boundaries that are set.
Private boundaries refer to the difference between private and public information (Petronio &
Child, 2020). There is a common boundary when private information is shared. It is known as a
personal boundary when a person chooses not to share their private information. People’s
private information is guarded by their boundaries. The boundaries keep changing and are sometimes
easy and sometimes hard to cross (Petronio & Child, 2020).
Control and Ownership
Communication privacy management theory holds that information is owned and that
every owner has the right to decide how to share private information (Petronio & Child, 2020).
In other situations, people can share their private information, which creates co-ownership.
Co-ownership of information is based on profound responsibility and an understanding of the rules for
disclosure. Once information is shared, the parties understand that the boundaries have enlarged and may not
return to their original position. It is the mandate of the co-owners to decide when and how to
share information (Petronio & Child, 2020).
Rule-Based Management System
The rule-based management system helps people manage their information and has three
stages: privacy rule characteristics, boundary coordination, and boundary turbulence (Petronio & Child,
2020).
Management Dialectics
Privacy management dialectics concerns the central tension between the forces advocating the sharing
of private information and those opposing it. These final elements are essential because they offer insight into privacy
and its meaning in society. Privacy rules guide how information may be shared,
depending on cultural norms and expectations.
Big data privacy refers to managing big data so as to lessen risk and protect sensitive
information. Big data contains massive and sophisticated data sets that traditional privacy
techniques and processes cannot deal with effectively. When data about users is collected,
it also becomes easier to connect it, draw conclusions about their behavior, and build detailed profiles
of their lives and preferences. Users want to have confidence in the handling of their data.
Consumers need to know how their information is stored, with which parties it is shared, and the methods of
complying with the necessary regulations that stand for privacy and data protection (Petronio &
Child, 2020).
CPM theory acknowledges that people believe they have a right to their private
information and the power to control it (Petronio & Child, 2020). The
way privacy boundaries express ownership defines how people protect their data. People give
access to their data to people and organizations that they trust. Organizations handling big data should ensure that their
customers feel secure and in control of their information and that sharing with other parties occurs with
their consent (Allen, 2017).
Privacy Rule Characteristics
The privacy rule characteristics fall into two parts: attributes and development.
Attributes are how people acquire privacy rules and understand the features of those rules (Spicer,
2017). Social interactions are ideal for attributes because there is no need for boundaries for
rules. Every situation has its own regulations and ways of managing privacy (Spicer, 2017).
The criteria in communication privacy management include culture, gender, context, motivation, and
risk/benefit ratio (Petronio & Child, 2020).
A big data privacy strategy ensures consideration of issues such as customer data usage, data
accuracy, and the chances of inconsistencies. Improving data security keeps threats from becoming
overwhelming, especially data breaches and insider threats. When users want to disclose
private information, they determine the boundary that achieves their goal of revealing or concealing that
information (Petronio & Child, 2020).
When information owners decide to share or hide their private information, they follow
privacy rules to help manage privacy (Allen, 2017). People use the private rule criteria to push
privacy rule choices. Privacy rule criteria help influence privacy choices, such as when a
person is caught off guard by someone who reveals private matters (Allen, 2017).
Privacy rules and privacy boundaries become more complicated when many parties are
involved. CPM privacy boundaries may be based on one person or, in other cases, on multiple
parties, such as group, family, and co-worker collective boundaries and social media
boundaries (Allen, 2017). Multiple boundaries raise coordination issues in privacy
management. People may choose to withhold information on their health conditions, which does not
always sit well with those affected (Allen, 2017).
Information security involves the processes and methods applied to achieve
confidentiality, integrity, and availability of information (Al-Dhahr et al., 2017). Information
security is a significant problem for many organizations, and considerable funds and resources are
devoted to achieving these aims. Consumers’ personal information has four central values:
operational, individual, societal, and value to others. People’s data needs to be handled with care and
respect and secured against any form of possible risk.
There are international standards that govern Information Security. Information security
management system (ISMS) ISO/IEC 27001:2013 is a guideline that describes approaches that
organizations should take to enhance information security and privacy (Al-Dhahr et al., 2017).
This set of standards helps organizations detect and address threats and vulnerabilities against
sensitive information and intellectual property. The guideline helps organizations shield
themselves from breaches and disruption of their operations (Kurnianto et al., 2018).
For organizations looking to enhance information privacy, ISO/IEC 27001:2013
offers guidelines on which measures to implement to meet security standards and preserve
information security and privacy. Also, the guidelines help assess the business environments
for potential risks and assist organizations in establishing their information security policies and
procedures (Kurnianto et al., 2018).
The role of the ISMS is to limit and prevent data breaches in the organization. It provides
guidelines for information security management that maintain a robust system for identifying
risks and preventing them. In addition, the ISMS helps organizations protect
information from any leakage, exposure, damage, or destruction and maintain the integrity,
confidentiality, and availability of the data.
Another resource available when utilizing the ISMS is risk assessment and
management. Companies that implement the ISMS can recognize foreseeable risks and
implement the appropriate measures to address and resolve them. In addition, controls evaluate
whether a business has met the necessary legislative and regulatory data security requirements.
The ISMS provides an information security framework with the following major components:
management principles, resources, personnel, and information security personnel. The
framework is subdivided into 11 security areas, each of which requires organizations to develop and
implement strategies (Al-Dhahr et al., 2017). (The security areas are illustrated in
Table 1.1.)
The NIST Risk Management Framework provides a comprehensive, repeatable, flexible, and
measurable process that an organization can use to manage its information security risk. It
links to a suite of NIST standards and guidelines that support the implementation of risk
management. It also helps organizations meet federal information security management requirements
and keep information safe and secure. It further supports implementation with resources such
as quick start guides that prepare essential organization management security for private and
public sectors by categorizing the system and the information it processes, stores, and transmits based on an impact
analysis (Ross, 2018).
The NIST standards select controls for the system based on risk assessment, implement them,
and document how they are deployed to manage documents and the transfer of information (Ross, 2018). The framework
assesses whether the controls are in place, operating as intended, and producing the desired results, with maximum
protection for customers’ savings and personal information. It also continuously monitors control
implementation and risks to the system and provides ways to safeguard data from third-party access (Ross,
2018).
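As an illustration of the categorize, select, implement, assess, and monitor activities described above, the following minimal sketch models the framework as an ordered sequence. The seven step names follow NIST SP 800-37 Rev. 2; the class name RmfStep, the helper functions, and the choice to treat monitoring as a terminal, continuous state are assumptions made for this example and do not come from the study.

```python
from enum import Enum


class RmfStep(Enum):
    """Seven steps of the NIST Risk Management Framework (SP 800-37 Rev. 2)."""
    PREPARE = 1
    CATEGORIZE = 2
    SELECT = 3
    IMPLEMENT = 4
    ASSESS = 5
    AUTHORIZE = 6
    MONITOR = 7


def next_step(current: RmfStep) -> RmfStep:
    """Return the step that follows `current`.

    Monitoring is continuous, so MONITOR maps back to itself.
    This is a simplification assumed for the sketch.
    """
    if current is RmfStep.MONITOR:
        return RmfStep.MONITOR
    return RmfStep(current.value + 1)


def walk_lifecycle(start: RmfStep = RmfStep.PREPARE) -> list[str]:
    """List the step names from `start` through MONITOR, in order."""
    names = [start.name]
    step = start
    while step is not RmfStep.MONITOR:
        step = next_step(step)
        names.append(step.name)
    return names


if __name__ == "__main__":
    print(" -> ".join(walk_lifecycle()))
```

An institution could attach its own evidence (for example, an impact level at CATEGORIZE or a control list at SELECT) to each step; the sketch only fixes the ordering.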
The risk management framework integrates security and privacy concerns and processes
into the system development life cycle (Ross, 2018). This risk-based approach to choosing
controls considers effectiveness and the constraints arising from pertinent laws, directives, executive orders, standards,
regulations, and policy (Ross, 2018). Implementing this framework leaves less room for
cyber threats in the financial sector. Hence, it provides clear guidance for financial institutions and
helps ensure that the measures are appropriate, adequate, communicated, diligently enforced, and
well-supported.
These codes are based on an illustration of points to provide a final section that governs
all the financial systems and tries to incorporate the professionalism needed in all the banking
sectors. Because of the distributed and interconnected nature of information technology services
in the supply chain, risk management is essential throughout the entire life cycle of a system. It
covers design, development, distribution, and acquisition, as well as the mitigation of supply chain threats
and the reduction of vulnerability in all financial systems, so that a product or service at any
stage enhances the compatibility of the monetary system (Almeida, 2017).
Risk management is another application of big data from which banks benefit by applying
data mining. Using the knowledge gained, institutions can anticipate the preferences and needs
of their customers, product utilization and acceptance, and borrowing and repayment trends. In
addition, financial crises cripple companies, leading to bankruptcy and closure. As a result,
understanding financial performance, liquidity, and credit has become important. Analyzing data
helps categorize and prioritize risks and helps build models to control risks (Almeida, 2017).
Risk treatment involves reduction, retention, transfer, and avoidance. Methods to achieve
these goals need the following security components: firewall, encryption methods, proxy, digital
signature, and HTTPS server (Drozdova et al., 2020). For instance, account and service hijacking
affects the cloud service models Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and
Software as a Service (SaaS). Banking applications utilize the Software as a Service model.
Mitigation techniques against account and service hijacking involve having a formal service
level agreement and security policies, multifactor authentication methods, and monitoring
activities in the cloud for any threats (Amara et al., 2017).
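The four treatment options named above can be illustrated with a small sketch that scores a risk and maps it to a treatment. This is an illustration only: the 1 to 5 scales, the likelihood-times-impact score, the thresholds, and the names Risk and choose_treatment are hypothetical assumptions for this example and are not drawn from the cited sources.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain); hypothetical scale
    impact: int      # 1 (negligible) to 5 (severe); hypothetical scale

    @property
    def score(self) -> int:
        # Simple likelihood x impact score, assumed for this sketch.
        return self.likelihood * self.impact


def choose_treatment(risk: Risk) -> str:
    """Map a risk to one of the four treatments named in the text.

    The thresholds below are illustrative assumptions only.
    """
    if risk.score >= 20:
        return "avoidance"   # too severe to accept in any form
    if risk.impact >= 4 and risk.likelihood <= 2:
        return "transfer"    # rare but costly, e.g. insure or contract out
    if risk.score >= 6:
        return "reduction"   # apply controls such as multifactor authentication
    return "retention"       # accept and keep under observation


if __name__ == "__main__":
    hijacking = Risk("account and service hijacking", likelihood=3, impact=4)
    print(hijacking.name, "->", choose_treatment(hijacking))
```

Under these assumed thresholds, account and service hijacking falls under reduction, which matches the mitigation techniques listed in the text (service level agreements, multifactor authentication, and cloud monitoring).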
Risk assessments strengthen an organization’s understanding of its risk environment and
identify the most vulnerable, high-risk areas. The goal of risk assessment is to better resolve
information-related risk across the whole business environment. By managing risks,
institutions can mitigate cyber threats. The National Institute of Standards and Technology
(NIST) Risk Management Framework provides guidelines for organizations on performing risk
assessments (Broeders et al., 2017).
When a security risk analysis is conducted to assess vulnerabilities affecting information and customers,
the organization is considered as a whole (Broeders et al., 2017). Regular monitoring
and review of risk are crucial for the organization due to evolving and new technological
advancements (Broeders et al., 2017). New technologies without proper monitoring can compromise
sensitive data, as vulnerabilities might arise while integrating new processes and techniques.
Further, risk assessment requires software to be regularly updated or replaced with more recent
versions. New versions extend the scale of data management and security possibilities.
NIST recommends that the risk assessment process be continuous. Cybersecurity incidents
can happen to any company, and the means of avoiding breaches is building a secure and effective
network system (Broeders et al., 2017). Threats and cyber-attacks are evolving with technology,
and organizations need to be capable of combating them. Other risks also surface due to the
sheer volume of data available from multiple points.
Protecting and Controlling Unclassified Information
NIST guidance on developing information systems applies, in principle, measures that
create a risk-model framework through standards addressing privacy and civil-liberties
concerns (Broeders et al., 2017). It offers broad protection for controlled information held in
nonfederal systems and organizations, which is paramount for agencies and directly impacts the
federal government. It also supports mission and business operations with a suite of guidance
that sets specific obligations focused on protecting the confidentiality of user information.
It also recommends the specific security requirements needed to meet the information
security objectives of the Federal Information Security Modernization Act. As a result, it
supports federal agencies’ compliance with the statutory and policy provisions established in
support of the security standards and guidelines developed by NIST (Broeders et al., 2017).
Under these risk management frameworks, cybersecurity mitigations include installing firewalls
and other security applications that help protect user data.
Improving Cross-Border Domestic Information-Sharing
Financial crime is a global problem that affects society today. Reducing fraud and
information breaches involves adopting new technology to improve cross-border and
domestic information-sharing. Cashless transactions are one of the most effective uses of
technology in addressing the global concern about fraud in the financial system. However, the
frameworks that govern data protection, the management of suspicious activity reports, privacy,
and bank secrecy can inhibit information-sharing (Pricewaterhouse, 2018).
At the international level, continued global attention is encouraged to improve the
effectiveness of member states’ information-sharing regimes. International financial
institutions support government implementation of secure information exchange and of
measures that stop the unethical sharing of personal data with third parties
(Pricewaterhouse, 2018). National commitments of this kind help reduce complex financial
crime through better global financial management and comprehensive protection against
cross-border financial crime. Compliance with these standards is reported and advanced by
introducing multinational protection units.
Reforming Suspicious Activity Reporting
Multiple authentication processes allow a banking institution to detect suspicious activity
and mitigate its harm immediately. They give users an opportunity to ensure that bank
transactions are kept safe and personal information remains available. Poor-quality suspicious
activity reports, by contrast, offer limited intelligence value and divert investigative resources,
which affects law enforcement. Banks also improve the feedback loop between financial institutions
to offer regulated sectors key activities that protect customers from fraud (Dobrowolski &
Sułkowski, 2019).
Improved technology makes combating economic challenges easier by strengthening the
technical measures that reduce breaches (Dobrowolski & Sułkowski, 2019). These include
multifactor identification, firewalls, and end-to-end encryption, to mention a few. Examining
the barriers to adopting new technology helps expand risk management toolkits and minimize
losses in banking institutions.
Above all, mitigating the inconsistent implementation of financial crime compliance
standards offers regulatory clarity. The scope of regulation gives financial institutions the
confidence to carry out their services despite the challenges of information breaches,
hacking, or malware attacks. Frameworks are adapted to each country’s culture and its political
and legal regulation, while international policy bodies provide overarching guidance alongside
appropriate national regulatory statements (Dobrowolski & Sułkowski, 2019).
Regulation also gives the public sector an essential role in defining oversight rules. It empowers
financial institutions to implement policies according to the government’s overall vision and the
purpose of regulatory frameworks. The global fight against financial crime is paramount in
addressing international and regional cyber-theft, breaches, and fraud. It involves examining
policy and legal issues in depth, with law enforcement authorities across multiple jurisdictions,
to gauge the current perspective of the financial services and public sectors. An economic crime
risk management system supports user data protection through engagement (Dobrowolski &
Sułkowski, 2019) and gives the financial system the strength it needs to withstand breaches in
the market.
Limitations of the Study
The study encountered challenges in accessing IT information in banking institutions and
in understanding critical terms of IT protection regarding big data analytics. The interviews had
limited time available per session. They were conducted one-on-one; some participants were busy
with work and could respond to only a few research questions. Collecting data from multiple
sources and analyzing it against the necessary security requirements was a significant challenge,
as the trusted institution could not overcome the information load (Demchenko et al., 2014).
The focus group asked to respond to the questions might fail to cooperate fully. Most
participants might lack the keenness and willingness to provide accurate information about the
challenges they face in their company. Ethnography encounters environmental difficulties in
studying the cultural impacts and motivation challenges most companies face (Hennink et al.,
2020). The research requires a lot of time to collect accurate information from the targeted
audience. Access to information on the security and privacy issues of big data applications is
limited.
Some technical terms are not fully explained because the discussion concentrates on the
existing solutions and problems that the research found helpful. The lack of improved internet
services that provide maximum user protection was a challenge in suggesting the best method to
protect user information. An increasing number of people rely on big data analytics for solutions
while neglecting security measures, which was a significant obstacle to implementing new user
protection measures for data privacy. As most people rely on the script method to protect user
information, this service will be overwhelmed by a lack of specialized personnel to provide
maximum user information protection.
As far as security proposals are concerned, it is crucial to examine the security of the
underlying infrastructure before trusting it fully with user protection. Future efforts to address
security and privacy challenges may encounter barriers, as most people will rely on one specific
model to provide security that benefits them over time (Yang et al., 2019). Nevertheless, both
present some direction that contributes to solving extensive big data privacy concerns. This
approach would provide a rights management system that processes and governs the content
users share on social media, which can otherwise cause more difficulties in handling the use of
personal information.
In converting the collected data to the desired storage output, problems were encountered
during data transformation. Transport was also an issue, as most of the information was
collected from different locations, which made collection take a long time before the
information was finalized. Switching to behavior-based security policies is quite expensive.
Challenges in meeting with the research director, who was to attend an environmental analysis,
and with the trusted application were also issues in the research proposal.
Assumptions of the Study
Big data refers to a collection of large data sets that are increasing exponentially over
time. The sources of data, driven by the proliferation and widespread use of the Internet of
Things, mobile devices, and social media, generate data on a large scale. The study assumes that
financial institutions accumulate substantial data sets from multiple sources and manage big data.
Thus, big data is associated with the financial industry (Hasan et al., 2020).
Financial institutions generate huge data sets daily from money transactions and account
modifications and updates. Each day, hundreds of millions of transactions occur in the
financial industry, leading to big data (Hasan et al., 2020). Data management and analytics on
financial services and products is an emerging issue for financial practitioners to consider (Hasan
et al., 2020). The assumption is that financial institutions in San Antonio, Texas, are processing
the data to predict customers’ activities and preferences in order to improve services and reduce
credit risk.
The most significant concern with big data is privacy (Joshi & Kadhiwala, 2017). Big
data involves collecting data from multiple sources to extract insights. Data in financial
institutions is not provided by customers only; transactions also generate data, for instance,
when purchasing goods online or in return for services and feedback (Soria-Comas & Domingo-Ferrer,
2015). The assumption is that financial institutions use big data and process information to
extract knowledge. Further, privacy protects customers’ personally identifiable
information against misuse, unauthorized access, and manipulation.
Suspicious activity reporting is one of the challenges that financial institutions and law
enforcement agencies face, owing to the low quality of the intelligence it yields. Investigating
criminal activities affects both big data analysis and banking institutions (Strom, 2016).
Additional feedback has been built into system controls to reduce the volume and improve the
quality of suspicious activity report (SAR) filings. Different deployments that improve
intelligence flow across science, technology, and financial regimes reduce crime and increase the
volume of intelligence insight shared with the private sector. Collective up-skilling will increase
dialogue between regulated sectors and their ability to act. However, it must also be involved in
the SAR life cycle to provide the best results to customers and other banking systems.
Improving feedback loops will also help mitigate these risks and overcome the
challenges of communication and enforcement among reporting institutions. Debriefing complex
cases from the SAR system would provide a potential breakthrough in resolving the issues
reported previously. Besides, integrity and reactive security need to provide
an endpoint that validates and filters input from all collecting devices, a significant challenge
facing big data, so that the trustworthiness and validity of data are established from the point of input.
Processes for securing information and approving protocols need to be enhanced, regardless
of whether they currently recognize malicious activity that may influence big data analysis. Such
a platform would organize source monitoring to provide feedback and to oversee the analysis of
actual attacks (Parms, 2017). Reaching a solution quickly, with few false alarms, will require all
big data analysis stakeholders to come together, identify the threats that might arise in the
future, and control them.
In addition, a solution should be deployed to prevent tampering with data and to develop a
framework that resists tampering (Parms, 2017). Ongoing security monitoring supports the
organization, as the first indication of an attack can then be detected.
Definitions
This research uses several key terms that are important for understanding how user data is
secured today and in the future.
Information systems. “Information systems are the study of networks of hardware and
software used to collect, create and distribute useful data in organizations” (Bourgeois &
Bourgeois, 2014). Information systems contain components that work together to gather,
process, store and procure information that aids in coordination, decision-making processes, and
visualization in companies (Bourgeois & Bourgeois, 2014). Thus, information systems
concentrate on components and roles to create an information system. Elements of information
systems include hardware, software, telecommunications, and data.
Hardware comprises physical components of the system. They are things that can be
touched and felt. It includes input and output devices that enable computers, smartphones, and
tablets to function (Leek, 2016). Hardware helps humans to interact and utilize technology.
Hardware includes mice, monitors, scanners, hard drives, and keyboards.
Software comprises the intangible part of the information system and handles input,
processing, output, and storage. Application software runs programs that serve particular
uses in information systems and can be either open source or closed source. Open-source
software is available for the public, especially programmers, to use as they wish, whereas the
public cannot access the source code of closed-source software (Coronado Mondragon et al., 2015).
Telecommunication systems connect computer networks and enable information sharing
through them. Telecommunications networks also help computers and storage devices obtain
data from the cloud (Bourgeois & Bourgeois, 2014). Telecommunications networks used to
deliver data include fiber optic cables used by cable providers to move data (Bourgeois &
Bourgeois, 2014). A local-area network (LAN) connects computers in a selected space. Wide-area
networks (WANs) are collections of LANs that enable data-sharing across vast areas. Finally, a
virtual private network (VPN) allows users to guard their privacy online through encryption on
public networks (Qu et al., 2018).
Data is “intangible, raw facts that are kept, transmitted, analyzed, and processed by other
components of information systems” (Bourgeois & Bourgeois, 2014). Data storage as numerical
facts stored in databases or warehouses fits every organization’s needs. Databases hold a
collection of data that can be retrieved whenever required. Databases enable users to conduct
essential operations like storage and retrieval. Data warehouses keep data from many sources for
analysis, allowing users to assess the organization (Qu et al., 2018).
Big data. “Big data is a term for large and complex unprocessed data” (Sahu, 2018). Big
data is challenging and complex, making it time-consuming to manage through traditional
methods. Big data features include volume, variety, velocity, variability, veracity, and complexity
(Sahu, 2018). Using big data involves analyzing and processing the data to meet requirements.
The significant role of big data analytics is to help companies make effective business
decisions by aiding data scientists and other analytics experts. Big data is applied in many fields,
including the government, web-based media, banking, agriculture, and healthcare (Sahu, 2018).
Personal data. OECD defines personal data as “any information relating to an identified
or identifiable individual (data subject). Any data that is not related to an identified or
identifiable individual is, therefore, “non-personal” data” (OECD, 2020).
Machine learning. Machine learning is an application of artificial intelligence that
teaches machines to learn automatically and improve through experience without being explicitly
programmed (Sarker, 2021). Machine learning is based on creating computer programs that
extract data and learn from it. Machine learning is crucial for cybersecurity. Deep learning can
help identify cyber threats and maneuvers through pattern detection and real-time cyber-crime
detection (Sarker, 2021). In addition, machine learning can help cybersecurity through prediction,
classification, recommendation, and generative models (Sarker, 2021). For instance, Microsoft
uses machine learning in Windows Defender Advanced Threat Protection to spot threats.
Data analytics. Data analytics is the process of analyzing raw data to draw conclusions from
the information. Most data analytics functions are automated into mechanical processes and
algorithms that work through information for human use. The application of data analytics in
cybersecurity helps to detect, analyze, and stop cyber threats, both internal and external. Data
analytics is crucial in ensuring cybersecurity and in helping businesses optimize their
performance. A cybersecurity analyst analyzes data from many sources to produce a conclusive
report that improves privacy or security (Jain et al., 2016).
Data mining. Data mining is a technique that involves obtaining knowledge from vast
volumes of data kept in databases or archives (Taric & Poovammal, 2017). Mined data often
contains sensitive information, and users do not want to be identified by the parties analyzing
the data. Data mining is performed using machine learning techniques, with either supervised or
unsupervised learning algorithms. The standard approaches are clustering, association rule
mining, and classification. These methods are designed to group data and find relevant
relationships (Mendes & Vilela, 2017).
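Of the approaches listed above, association rule mining is the simplest to illustrate. The sketch below computes the two standard measures of a rule, support and confidence, over a handful of invented transactions. The item names and the helper functions `support` and `confidence` are assumptions made for this example, not data or code from the cited studies.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Hypothetical customer activity records (one set of events per customer)
transactions = [
    {"card_payment", "online_purchase"},
    {"card_payment", "online_purchase", "loan_repayment"},
    {"card_payment", "atm_withdrawal"},
    {"online_purchase"},
]

rule_support = support(transactions, {"card_payment", "online_purchase"})
rule_confidence = confidence(transactions, {"card_payment"}, {"online_purchase"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Full algorithms such as Apriori search for all itemsets above a support threshold; the sketch only evaluates a single rule.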
Sensor technology. Sensor technology is where a machine collects data and
information that is channelled into the digital world (Patel et al., 2020). The information
collected is complex and large in volume, making it difficult to process through traditional
means. Big data techniques can analyze and process complex and significant volumes of data
(Mendes & Vilela, 2017). The use of sensors has increased over the years, meaning that so much
information and data are collected that traditional methods will not work effectively (Patel et al., 2020).
Computer network system. The computer network system involves data storage and
transmission, making it crucial to ensure security (Kurose & Ross, 2021). Computer networks
enable the sharing of network and computing resources. With meaningful data, users can access
and use data and information found on the network devices. A network makes it possible to share
files, data, and other information as long as access is authorized (Kurose & Ross, 2021).
Clustering. Clustering is the process of sharing computation tasks among many
computers (Mehmood et al., 2016). The computers involved form a cluster. Cluster computing
runs on distributed systems connected by networks. Clustering is crucial in big data because of
its processing speed, cost-effectiveness, scalability, and increased resources (Mehmood et al.,
2016). It also leads to improved performance and availability, making it an essential tool for
global computing.
Cloud computing. Cloud computing is the availability of computer system resources when
needed. These resources include data storage and computing power, and the user does not
directly and actively manage them. It is often described as data centers available to many users
over the internet. Functions are distributed to many locations from central servers.
The essence of cloud computing facilities is to enable multiple users to benefit from technologies
without the trouble of learning about them in depth. Many businesses use cloud computing to
obtain quicker and better-quality services for their big data. The technology is
practical, flexible, secure, and efficient and prevents data loss through replication (Solangi et al.,
2018).
Monitoring and Auditing. This crucial part of network security management enables
service providers to enhance the information acquired on specific security measures.
Network monitoring gathers information from the network to systematically review and measure
users’ compliance with security policies and their interaction with the network security models
(Rupper, 2017). Besides, periodic monitoring supports intrusion detection and prevention, applying
only the tracking necessary for complete network security protection against abnormal user
behavior and suspicious data behavior (Rupper, 2017). This model relies on a
systematic review of trust behaviors that detects problems before they reach a maximum limit.
Similarly, network auditing allows big data systems to analyze audit challenges and
verify integrity and availability. Moreover, these techniques maintain a number of replicas
that keep information easily accessible, minimize data breaches, and support the verification of
updates on dynamic datasets, providing integrated schemes for simultaneous auditing and
authentication (Rupper, 2017).
Machine Learning. Machine learning requires time to learn, after which the system can
predict when an advanced persistent threat (APT) attempts to attack. Machine learning can
anticipate the attacker’s plans and stop them before they are carried out. The machine learning
system can recommend actions for security administrators to take. A company can prevent an
attack by following the recommendations offered by machine learning systems (Jain et al., 2016).
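The learn-then-predict pattern described above can be sketched with a very simple statistical baseline: the detector first learns normal activity from history and then flags values that deviate strongly from it. This is a deliberately simplified stand-in for a machine learning model, not the approach of Jain et al. (2016). The class name `BaselineDetector`, the threshold of three standard deviations, the sample data, and the recommendation text are all assumptions made for illustration.

```python
from statistics import mean, stdev


class BaselineDetector:
    """Learns a baseline from past activity and flags strong deviations."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.mu = None
        self.sigma = None

    def fit(self, history):
        # "Learning" phase: estimate normal behavior from historical counts
        self.mu = mean(history)
        self.sigma = stdev(history)
        return self

    def is_anomalous(self, value) -> bool:
        if self.mu is None:
            raise ValueError("call fit() before scoring new values")
        if self.sigma == 0:
            return value != self.mu
        return abs(value - self.mu) / self.sigma > self.threshold

    def recommend(self, value) -> str:
        # Hypothetical recommendation passed to a security administrator
        if self.is_anomalous(value):
            return "alert administrator and require re-authentication"
        return "no action"


# Hypothetical hourly counts of failed logins during normal operation
history = [3, 4, 5, 4, 3, 5, 4, 4]
detector = BaselineDetector().fit(history)
print(detector.recommend(5))
print(detector.recommend(40))
```

Real APT detection combines many signals over long periods; the sketch only shows why a learning period must come before prediction.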
Summary
Network technology is where there is an exchange of data between information systems
used by organizations. Networking enables users to share digital information through audio,
visual, and data files. Network technology is about data exchange and transmission, and network
management guarantees a smooth flow of communication between the computers and their users.
Network technology management increases operational efficiency, which
leads to effective operating procedures and cost reduction. User experience improves, as do
customer satisfaction and feedback. Network technology and cybersecurity are crucial
in every organization, and there are practices to ensure that companies’ networks and
cybersecurity are appropriately managed (Reidenberg & Schaub, 2018).
Network technology is evolving every day, and password management should change with it
to keep security strong at all times. Passwords should not be entered over unsecured Wi-Fi
connections because hackers can take advantage of them. Organizations should train staff on the
importance of using passwords as a security measure, ensure that passwords are changed
regularly to reduce vulnerability to malicious attempts, and encourage staff to always log out of
websites and reject requests to remember passwords (Reidenberg & Schaub, 2018).
Big data management and privacy are current issues, and different frameworks
must enhance the existing security measures and protect user information. Concern for data
privacy is paramount to all banking institutions, given the continuous growth in the risk of
breaches and the increasing incidence of cyber hacking. Furthermore, data privacy is a significant
concern due to most companies’ failures in protecting their clients’ confidential data. This
research aims to build customer confidence in financial institutions by suggesting the
best ways to uphold customer personal data privacy standards.
Although digital technology and big data privacy and security measures exist, these
measures are inadequate, since systems still face growing threats, especially during the Covid-19
pandemic. The increased number of breaches and threats in banking institutions in San Antonio
is a clear example. Most countries are improving how they secure personal
information with improved technology. However, a remaining concern is identifying the measures
that management teams currently take to implement the latest data security
measures and protect user information in banking institutions and the government sector.
The risk of breaches threatens the information security sector. Thus, an institution needs to
provide a clear understanding of how each segment of its data is protected and is free from
individual data breaches and ensure that artist’ protectors give an opportunity for index
recommendations on the best actions for the customer experience. These
provide real engagement with customers and create awareness, advocacy, and the best
customer experience in storing their information.
Comparing the pharmaceutical and finance industries offers insight into cyber criminals who
pursue the health industry and progressively suggests the best countermeasures.
Cybereason and scale are essential tools whose features and capabilities can be matched to a
specific targeted solution, such as deploying sensors that run in the user space of the operating
system, allow data collection, and minimize end-user disruptions.
The theory used to plan for privacy on social media and in forensic applications, which
identify criminals, is data and communication privacy management. This theory describes the best
way to handle disclosure and to control the flow of information. It provides criteria for
handling information so that ownership and management of data are ensured.
Most financial institutions have existing information frameworks, but these frameworks are
being strained and are vulnerable to the new threats that emerge with the latest technology. Big
data enables organizations to collect information and make smart choices that drive decisions
positively (Diniz et al., 2017). Protecting user information maintains confidentiality and trust
in an organization.
Customers’ confidence has recently declined as crime and credit fraud have been
increasingly reported. Digital privacy is a sensitive topic that requires all institutions to
secure user data in order to protect bank accounts. Sharing personal information is a threat to the
customer, as it exposes them to existing fraud and theft risks.
Kumar (2017) posits that all financial institutions need to utilize big data facilities to guard
against cyber theft and improve intelligence on customer behavior. Information needs to
be processed to give a clear overview of what is required to protect personal information and to
drive decisions positively. Also, potential loopholes that allow malicious
people to access bank systems need repair to protect user information. In recent years, there
has been an increasing number of cyber-attacks on various institutions, leading to customer
information leakages.
Similarly, reports of attacks on credit cards are numerous, with customers
losing money held in banking institutions. These have been significant losses for both the
company and the customer. This study found that the current digital technology and big data
privacy and security systems that institutions continue to rely on are not 100% secure, leading to
frequent attacks.
Chapter 2, the literature review, presents concepts and solutions proposed by different
scholars to ensure that confidential information is collected and stored safely. It
introduces the idea of big data, big data in the finance industry,
data privacy problems, common cyber threats and attacks against personal information, and
common security approaches that preserve privacy during big data management. The review
includes research articles describing these components and internet sources providing statistical
data regarding big data privacy issues of financial institutions.
Chapter Two
Introduction
This chapter entails reviewing related literature studies and introducing the concept of big
data and its technologies in banking systems and the limitations affecting data privacy. Besides,
the chapter describes various frameworks and solutions that different scholars have proposed that
financial institutions can apply to guarantee the confidentiality of information collected.
Understanding what big data entails makes it possible to create a framework that surpasses the
vulnerabilities and keeps customers' information private.
Facilitating business sustainability and continuity helps an institution retain customers and
attract new ones to its products and services. For example, financial institutions collect personal
information such as home and work addresses, names, social security numbers, driver's licenses,
and contact information (Copelovitch et al., 2018). The data is necessary for creating bank
records, credit cards, and several databases that enable money transactions. In the IT world, big
data is the new trend due to its impact on business.
Big data enables institutions to utilize the information they collect to make intelligent,
data-driven decisions that positively impact the business. However, privacy is a concern with big
data and arises from poor management and security measures while using big data (Xie, 2018).
Customers' confidence in an institution lessens when doubts arise about its ability to
safeguard their data. Besides, crimes such as credit fraud and identity theft are on the rise, which
raises concerns about the level of safety of personal data collected by banks (Véliz, 2018).
Privacy in big data is a critical and sensitive concern today. Privacy entails the ability to
safeguard personally identifiable information. Abouelmehdi et al. (2018) highlight that privacy
involves the correct use of user data by making appropriate decisions on where to store each type of
information, with whom it is shared, and where it goes. Sharing personal information with third
parties without authorization is not allowed. Big data affects privacy during the generation,
storage, and processing phases (Jain et al., 2016).
Kumar (2017) argues that financial institutions utilize big data in many ways to enhance
cyber-security while gathering intelligence on customers' behavior. However, they collect vast
amounts of data that generate little or no insight helpful for the business. Possessing such
large volumes of data becomes burdensome, as does keeping all the information safe. Also, there are
many potential loopholes for leaks and for malicious people to access the systems (Véliz, 2018).
Besides, despite the regulated collection of this information, it is under threat. A breach
violates a person's privacy and security. In recent years, there have been cases of
cyber-attacks on numerous institutions leading to leakage of customers' information.
Similarly, institutions have reported credit card attacks that resulted in money lost from accounts
and in hackers using the cards to commit fraud (Srinivas et al., 2019). Besides, with mobile banking
advancements, databases are growing exponentially and holding more personal data. In their
study, Tao et al. (2019) found that continued reliance on traditional digital technology and
big data privacy and security measures is inadequate. The platforms face more sophisticated and
more frequent attacks.
Data privacy is a significant concern today because many companies fail to protect the
confidential and sensitive customer information they collect. The problem of data privacy is
paramount in financial institutions, which manage enormous amounts of data from their
customers (Swinnen, 2018). The data is continuously growing in volume, and the
risk of breaches and threats increases exponentially. Enterprises are implementing the insights
generated from big data, but some overlook the privacy problems growing with the continuous
use of analytics and personal information (Sharda et al., 2020).
The vulnerabilities in these systems arise from data being generated at many points at
the same time, which exposes the data in many ways. Véliz (2018) states that banks are
slow to embrace innovations to refresh their systems. Consequently, as cybercrime expands,
personal data is less protected. Sixty-two percent of banks are cautious about using big
data because of the security issues raised (Fang & Zhang, 2016). Collecting and distributing data
across networks creates an unavoidable security hazard. The methodology discussed in
this section recommends a feasible solution to this problem. Hence, a framework that grants
robust security to information is necessary.
Structure of Banking Systems
The structure of financial institutions and the level of activities carried out rely on
lawmakers' regulations. There are various types of banking system structures today, and the
mode of operations depends on organizational characteristics and the techniques applied. In
addition, the structure determines the volume of data collected and stored from the services and
transactions carried out. In the United States, the current systems based on organizational
characteristics involve unit banking, chain banking, group banking, and branch banking
(Lessambo, 2020).
Financial institutions are essential to our everyday living. They provide us with services
that support our lifestyle. Services offered by financial institutions include savings accounts,
checking accounts, money market deposit accounts, certificates of deposit, consumer loans,
business loans, electronic funds transfers, automated teller machines, debit cards, online
banking, mobile apps, and direct deposits of paychecks (Gitman & Zutter, 2018). Further, they
support the economy, development, and growth (Gitman & Zutter, 2018).
The various financial institutions include central banks, internet banks, retail and commercial
banks, savings and loan associations, credit unions, brokerage firms, investment banks and
companies, insurance companies, and mortgage companies (Horton, 2019). These institutions
offer a variety of services. Central banks are responsible for monetary policy. The Federal
Reserve Bank is the central bank in the United States. Internet banks offer services through
online platforms that can be affiliated with a bank. They offer deposit and savings accounts, allow
money payments and transfers, and provide loans (Horton, 2019).
According to techniques and activities, the structure entails central banks, retail banks,
commercial banks, investment banks, and cooperative banks. In the United States, most banks
are commercial and account for 80% of the total U.S. banking assets (Muraleedharan, 2014).
Over the years, the fast growth, innovation, and development of information technology
structures have revolutionized the banking system; hence, mobile and online banking is the new
financial institution system (Komb et al., 2016).
Internet banking has created a platform where customers have unlimited access to their
accounts and money. Transactions can be carried out anywhere in the world at any time. Besides,
customers can retrieve and display their bank statements after making transactions. It is the new
trend, and many commercial banks are applying it to save customers the time spent on traditional
branch visits. It is an excellent system for the provision of products and services.
Nevertheless, it introduces numerous cybersecurity risks that affect the security of customers'
information (Komb et al., 2016).
Institutions have utilized the vast amounts of customer data generated, together with analytics, in
capital market trading sectors. A business insurer's primary role is to understand the data by
analyzing raw data and using the insights gained to evaluate the risks. Professionals dealing with
data analysis and visualization depend on analytics and manipulating data to perform their core
role. Hence big data in financial institutions is a dominant factor and has received a lot of
attention (Cavanillas et al., 2016).
Financial institutions are dependent on data and information technology to perform their
operations and run the entities. As a result, they collect sensitive and confidential information
from individuals, including social security numbers and past and current home addresses. A
survey by Statista estimates that 155 million people had their personal information exposed
during data breaches in the year surveyed (Rose & Johnson, 2020). According to NetDiligence,
personal information was disclosed in 86 percent of data breaches in 2015 (Cavanillas et al.,
2016). Big data technology is one of the most promising domains in finance. A survey by
TechNavio forecasted that the big data market would grow intensively, by 56 percent, from 2012
to 2016. The main contributors to the growth include advancements in technology, the need to
meet financial obligations and regulations, and the pursuit of an advantage over competitors
(Begenau et al., 2018).
Big Data
The evolution of big data began in 1944 when Fremont Rider predicted that there would
be an information explosion (Marathe, 2016). Following that, there were many discussions about
the information explosion until 1997, when the concept of big data emerged. In 2005, Yahoo used
Hadoop to process petabytes of data, after which other companies began to use Hadoop on big
data. Companies using big data have experienced cost reduction, better decision-making, and the
introduction of new products and services. Most businesses need big data analytics to handle the
large volume of data they hold. Organizations in sectors such as banking, manufacturing, and
healthcare use big data extensively (Marathe, 2016).
Big data is the collection of a large volume of data that keeps increasing over time. Big
data is a term for "large and complex unprocessed data" (Sahu, 2018). It deals with data sets
that are too large and complex to be handled by traditional data processing methods. Big data
has the following features: volume, variety, velocity, veracity, exhaustivity, relationality,
scalability, value, and variability. Big data analysis helps businesses in
decision-making and strategic planning. It also helps organizations grow and experience new
opportunities to acquire information about goods and services, consumer preferences, and buyers
and sellers (Solangi et al., 2018).
Big data is grouped as structured or unstructured. Structured data comprises information
that the company manages through spreadsheets and databases. Unstructured data does not follow
any format or model; data collected from social media is one example. Comments found on social
media networks, websites, applications, and questionnaires generate big data. With new
technology, data is also available from sensors and smart devices. Examples of big data sources
include stock exchanges, social media, and jet engines (Sahu, 2018).
Big data is complex and sophisticated and requires advanced technologies to handle the
information. The traditional data processing methods cannot process this data as it is
overwhelming, hence the need for powerful algorithms to capture, store, search, analyze, and
visualize information. Big data sources for an organization involve various daily transactions, the
firms' data, social media, public data, and sensor data (Sharda et al., 2020).
Big data has three main characteristics: volume, velocity, and variety. Additional
characteristics help explain the complexity of big data: veracity, value, variability, and
visualization (Furht & Villanustre, 2016; Goswami & Madan, 2017; Oussous et al., 2018).
Volume. Volume in big data refers to the large size of the data collected, which has evolved
from terabytes to petabytes. Information is created every second, and the capacity to
produce it at such rates comes from the Internet of Things, which interconnects devices around
the world. In addition, information from records and documents is vast and expected to keep
growing in volume yearly. For instance, the volume of data produced each day is around 2.5
quintillion bytes in a financial institution (Fedak, 2019).
Velocity refers to the rate at which data must be transferred and updated, often in real time.
It also involves streaming and low-latency data handling, which enable fast management of
data. Analysis, processing, and storage happen at the point of collection, between events.
Commercial banks' current systems allow a threshold of 1000 transactions per minute (Fedak,
2019).
Variety refers to different formats of data arising from multiple sources. The various data
formats are mainly categorized as either structured or unstructured and depend on the collection
source. For instance, data from the internet is primarily unstructured, while information from an
enterprise database is structured. The data formats can be videos, graphics, emails, texts, tables,
spreadsheets, and logs. Data sets can also be public or private, and confidential or shared.
Veracity means that the big data generated is truthful and accurate. An entity cannot
realize the benefits of big data if the information is inaccurate. The use of inaccurate data can
be disastrous for an entity whose systems analyze data and automatically use the information in
decision making. Achieving veracity involves data cleaning, since errors may accumulate in a
data stream from multiple sources (Oussous et al., 2018).
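The data cleaning step mentioned above can be illustrated with a minimal sketch. The record layout (`id`, `amount`) and the validity rules are assumptions made for illustration only; they are not taken from Oussous et al. (2018) or from any bank's actual schema.

```python
def clean_records(records):
    """Drop records with missing or invalid fields and remove duplicates by id.

    The field names and rules are hypothetical and chosen for illustration.
    """
    seen_ids = set()
    cleaned = []
    for rec in records:
        amount = rec.get("amount")
        # Reject records that lack an id or carry a missing or negative amount.
        if rec.get("id") is None or not isinstance(amount, (int, float)) or amount < 0:
            continue
        # Keep only the first copy when several sources report the same record.
        if rec["id"] in seen_ids:
            continue
        seen_ids.add(rec["id"])
        cleaned.append(rec)
    return cleaned


# Hypothetical stream merged from multiple sources.
raw = [
    {"id": 1, "amount": 25.0},
    {"id": 1, "amount": 25.0},   # duplicate from a second source
    {"id": 2, "amount": None},   # missing value
    {"id": 3, "amount": -5.0},   # invalid negative amount
    {"id": 4, "amount": 40.0},
]
cleaned = clean_records(raw)
```

In practice the rules would come from the institution's data quality policy; the point is only that errors are filtered out before the data feeds automated decisions.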
Value refers to the collected benefits obtained when the user's data is processed and
analyzed (Furht & Villanustre, 2016).
Variability in big data refers to a constant change of data (Furht & Villanustre, 2016).
Visualization occurs after data processing and presents the data in forms that are
understandable and meaningful to individuals. Techniques such as graphs and charts represent
the data. Visualization enables an entity to identify trends and patterns of the variables defined
(Furht & Villanustre, 2016).
In their study, Fang and Zhang (2016) found that big data characteristics cut across all
sectors of financial institutions. Data volume has been growing immensely in
financial institutions from multiple areas. Structured data acquired from exchanges and bank
vendors, and unstructured data from news feeds, tweets, and social media pages, are helpful for
various strategic decisions such as tailoring products to customers' preferences and making
investment decisions.
The use of web applications in mobile banking has tremendously increased the rate of money
transactions. It is easy to buy goods from online retailers, pay for food services, send and
receive money, and pay for transport through mobile devices. Despite the opportunities, the
high velocity poses a challenge for financial entities to manage and exploit data effectively
(Fang & Zhang, 2016).
Big data becomes valuable to entities through a chain of processes (Cavanillas et al.,
2016; Fedak, 2019; Mehmood et al., 2016). The process is as follows:
Data acquisition. It is a process that involves collecting, cleaning, and transforming data
before placing it in a storage warehouse, where analysis then follows. In data acquisition,
the infrastructure required to support the process must be advanced and scalable to handle large
volumes of information (Fedak, 2019).
Data Analysis. Analysis entails the transformation of raw data to valuable insights useful
in the decision-making process. This process involves exploring and modeling data to ensure all
relevant data is highlighted and extracted. Data analysis utilizes technologies such
as data mining, business intelligence, and machine learning (Cavanillas et al., 2016).
Data Processing. This process involves actively managing data over its life cycle to meet
the required quality for proper usage. It entails classification, selection, preservation, and
validation. As a result, the accessibility and quality of data improve, and the data used is
guaranteed to be trustworthy and reusable (Mehmood et al., 2016).
Data storage. Management of data requires storing information so that authorized
personnel can easily access it. The traditional relational database management system has been
the primary storage paradigm used by institutions. However, data has grown in volume and
complexity and will continue to increase; hence, the lack of schema flexibility and fault
tolerance has made such systems unsuitable for storing big data. Newer technologies such as NoSQL
have evolved and provide the scalability required in big data (Mehmood et al., 2016).
Data usage. The data can be assessed and utilized in the business to drive decision-making.
This data's benefits are extensive and offer businesses the capability to be more
successful and sustainable in the market (Cavanillas et al., 2016).
Big Data Analytics in the Financial Sector
Like every other industry, the finance industry has faced transformation through
digitization. Digitization has enabled financial institutions to be more competitive in the
economic market. Big companies are integrating the new technologies to facilitate digital
transformation, satisfy their customers and increase revenues. However, most companies cannot
use the innovations because their data is not structured (Awotunde et al., 2021).
Advanced persistent threats (APTs) pose serious challenges to the security and safety of
cyberspace. However, machine learning and data analytics have the potential to prevent
them. A machine learning system can detect the signs of an APT attack and act quickly and
effectively using data from data analytics systems (Pejić Bach et al., 2019).
Advanced analytics and big data are serious topics because of their potential to transform
the business world. Big companies worldwide invest heavily in big data and advanced analytics
and have experienced positive results. Advanced analytics involves applying new data analysis
methods to business operations. Data analysis algorithms fall into various
categories, including linear regression, classification, and regression trees (Pejić Bach et al.,
2019). Implementing these algorithms produces refined data that business applications can use,
especially in decision making.
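As an illustration of the simplest of these categories, the sketch below fits a linear regression by ordinary least squares and uses it for a one-step forecast. The data values and variable names are hypothetical and serve only to show the mechanics; they do not come from the cited studies.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: month index against a customer's monthly card spend.
months = [1, 2, 3, 4, 5]
spend = [110.0, 130.0, 150.0, 170.0, 190.0]

slope, intercept = fit_line(months, spend)
forecast_month_6 = slope * 6 + intercept  # extrapolate one month ahead
```

A production system would normally rely on a statistics or machine learning library and validate the model on held-out data; the closed-form fit is shown here only because it makes the calculation explicit.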
Fang and Zhang (2016) elaborate on the variety of big data in financial institutions.
Information collected can be either structured or unstructured. Structured data refers
to numeric, organized, formatted data and arises from large data sets such as spreadsheets and
databases. In banks, the data generated depends on the systems and technology adopted.
Unstructured data involves information in text form, which is not organized or formatted. Many
data sources such as emails, the internet, social media, and newspapers are unstructured.
Nevertheless, the data is vital, as organizations can gain insights into fraudulent transactions
and market patterns and trends.
In their article, Davenport and Dyché (2013) highlight how big data has helped Bank of
America grow. The bank has over 50 million customers, including small businesses; hence, it
generates vast amounts of data from transactions, customers, and unstructured sources. Because
of its enormous customer base, the bank had challenges in its early days in processing and
analyzing data from the customer set. The emergence of big data technologies enabled the
company to utilize the data and understand its customers.
There are multiple ways banks utilize big data to their advantage in business through
mining and analysis. Increasing success and productivity entails enhancing retail customer
services, improving fraud detection systems, and raising operational efficiency (Bholat, 2015).
Big data is helpful in real-time to find vulnerabilities in the systems across a wide range of
financial instruments. Analytics methods such as predictive analysis can help manage internal
and external risks associated with operational and credit risks (Hasan et al., 2020).
Big data in finance is the structured and unstructured data used to analyze customer
actions and form effective strategies for financial institutions like banks. The finance industry
accumulates a lot of data; hence it requires proper and safe handling (Hasan et al., 2020).
Structured data is information handled in an organization to offer insight during
decision-making. On the other hand, unstructured data is available in many places, offering critical
analytical opportunities (Hasan et al., 2020).
Financial institutions handle billions of dollars in the global market every day. Data
analysts are tasked with monitoring data, ensuring its security, and forming predictive strategies.
The data is essential, and its value depends heavily on how well it is collected, processed,
stored, and analyzed. Traditional data analysis and storage are no longer adequate; hence,
analysts prefer cloud data solutions. Cloud-based big data solutions have many benefits,
including flexibility, scalability, and integrity (Yu, 2016).
Financial institutions have the power to use big data to create new sources of revenue,
personalize offerings to customer preferences, and deliver good customer service. Predictive
analytics is a part of big data, and it mainly predicts future behavior by using past data. It
uses machine learning technologies, data mining, statistical modeling, and mathematical models to
predict future events (Tyagi, 2020). Predictive analytics can achieve a high level of accuracy, and
the past data helps analysts extract trends that could occur at a specific time and place (Furht &
Villanustre, 2016).
Prescriptive analytics assists companies in gaining their desired results by notifying them
of changes that are about to occur and helping find the best outcomes for the business. It uses
descriptive and predictive analytics and depends on helpful inputs from data monitoring and
hence provides the best solutions for customer satisfaction, company profits, and high efficiency
(Storey & Song, 2017). As a result, organizations that have successfully adopted big data have
enjoyed impressive results (Singla, 2020). The applications are as follows:
Real-Time Stock Market Insights
Hussain and Prieto (2016) elaborate on the different sources of the three primary forms of
data: structured, unstructured, and semi-structured, in financial institutions. Sources of structured
data, which is organized and formatted, include trading and account systems, price information,
security reference information, external data from market providers, and technical indicators.
Unstructured data for financial institutions arises from daily stock feeds, online feeds, customer
feedback, email, announcements and articles, and blogs. Finally, semi-structured data, which
does not conform to the usual structure, arises from meta-languages. It is mainly XML-based,
such as Financial products Markup Language (FpML) and Interactive Financial
eXchange (IFX) (Hussain & Prieto, 2016).
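To show how XML-based semi-structured data can be turned into a structured record, the following minimal sketch parses a small message with Python's standard library. The message layout is hypothetical and greatly simplified; it does not follow the actual FpML or IFX schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified message. It is NOT a valid FpML or IFX document.
SAMPLE = """
<trade id="T-001">
  <party>Bank A</party>
  <party>Bank B</party>
  <notional currency="USD">1000000</notional>
</trade>
"""


def parse_trade(xml_text):
    """Extract a flat, structured record from the semi-structured message."""
    root = ET.fromstring(xml_text.strip())
    notional = root.find("notional")
    return {
        "trade_id": root.get("id"),
        "parties": [party.text for party in root.findall("party")],
        "currency": notional.get("currency"),
        "amount": float(notional.text),
    }


record = parse_trade(SAMPLE)
```

The tags carry the structure, so the same parser tolerates a varying number of `party` elements, which is what distinguishes semi-structured data from a fixed table row.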
Big data helps financial institutions analyze more than stock prices, including political
and social trends that affect the stock market and other financial activities (Hussain & Prieto,
2016). Machine learning has positively impacted trade and investments by monitoring trends in
real time and simplifying analysis and data gathering, leading to practical and life-changing
decisions. A computer can make accurate predictions faster than human beings. Analysts can
make better-informed decisions and minimize mistakes caused by human error and bias. Big data
and algorithmic trading lead to hi-tech ideas and solutions for traders and maximize their returns
(Hussain & Prieto, 2016).
Fraud Detection and Prevention
Together with big data, machine learning has increased the detection and prevention of
fraud. For example, analytics of buying patterns reduces the security risks involved in credit
card handling. Banks can counter a thief's actions by freezing the card and the
transaction when protected credit card information is stolen (IBM Global Business Services,
2017).
Financial organizations use big data to eliminate information asymmetry issues and to
comply with regulations comfortably. Banks can get data in real time, which greatly reduces
fraudulent activities. For instance, when two transactions occur on one credit card within a short
time but in different countries, institutions can alert the cardholder to a possible security threat
and act quickly before the transactions go through (Yu, 2016).
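The two-country rule described above can be sketched as a simple check over a transaction stream. The 30-minute window, the field names, and the sample records are assumptions for illustration; the source specifies only "a short time", and real systems combine many more signals.

```python
from datetime import datetime, timedelta


def flag_geo_velocity(transactions, window=timedelta(minutes=30)):
    """Return pairs of consecutive transactions on the same card that occur
    in different countries within `window` (an assumed threshold)."""
    alerts = []
    last_seen = {}  # card id -> most recent transaction on that card
    for tx in sorted(transactions, key=lambda t: t["time"]):
        prev = last_seen.get(tx["card"])
        if (
            prev is not None
            and prev["country"] != tx["country"]
            and tx["time"] - prev["time"] <= window
        ):
            alerts.append((prev, tx))
        last_seen[tx["card"]] = tx
    return alerts


# Hypothetical transactions.
sample = [
    {"card": "A", "country": "US", "time": datetime(2021, 1, 1, 10, 0)},
    {"card": "A", "country": "FR", "time": datetime(2021, 1, 1, 10, 10)},  # suspicious
    {"card": "B", "country": "US", "time": datetime(2021, 1, 1, 10, 0)},
    {"card": "B", "country": "US", "time": datetime(2021, 1, 1, 10, 5)},   # same country
    {"card": "C", "country": "US", "time": datetime(2021, 1, 1, 10, 0)},
    {"card": "C", "country": "DE", "time": datetime(2021, 1, 1, 12, 0)},   # outside window
]
alerts = flag_geo_velocity(sample)
```

Only card A is flagged in this sample. An institution would then hold the second transaction and contact the cardholder before it settles.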
Financial institutions such as insurers can use big data to perform detailed due diligence,
minimize fraud, and catch suspicious activities early. For instance, Alibaba created a
management system to detect fraudulent activities through big data processing (IBM Global
Business Services, 2017). The company was able to find possible fraudulent activities through
detailed analysis and investigations.
Customer Analytics
Consumer analytics is about the "analysis of spending patterns, investments, shopping
trends, and other finances" (Storey & Song, 2017, p.51). It assists customers in finding helpful
solutions, and it is the best practical strategy. Financial service providers can monitor customer
behavior and actions to determine the best time to reach out with their services. Big data analytics
identifies patterns such as demographic patterns and behavior changes. It also helps identify
spending risks, protecting customers from possible challenges (Storey & Song, 2017).
Customers are essential to businesses, and big data can facilitate methods to offer better
services to them. Customers are the priority of companies, and their needs and preferences are
highly valued and used to plan for the future. Companies can develop better sales leads and use
the new technology to improve product quality and enhance customers' satisfaction. Good
customer relationships and an understanding of customers' preferences are a priority, so that
organizations can deliver customer-friendly products and meet actual market needs (Singla,
2020).
Kumar (2017) highlights the generation and application of big data through overseas
transactions by customers in different locations of the world. The company can understand its
customers through the various data generated from unusual activities. With advancements in
technology and digital intelligence, the banking sector has evolved in how it handles
transactions. The world is a global community, and the interconnection between parts of the
world has drastically changed the bank systems.
Creating digitized databases in banks for customers, some of whom travel and
transact frequently from different locations in the world, entails using data from traditional and
digital sources. Kumar (2017) highlights that creating an electronic paper trail for customers who
transact from different places requires payment records, money transfers, and credit history to
discover and analyze the data. The information from these sources reveals many activities and
patterns that can be used to enhance products and services.
Accurate Risk Analysis
Big data enables companies to make better decisions on finances, investments, and loans
through machine learning (Singla, 2020). Predictive analysis covers economic conditions,
capital, and possible business and investment risks. Big data analytics has brought big
opportunities to enhance predictive analysis of rates of return and investments. Big
data is valuable and transformational because algorithms can turn it into understandable
results.
Skyrius et al. (2017) state that banks can utilize big data in many ways. The primary
application is to leverage and exploit the data to gain a competitive advantage over other entities.
The most important use of big data and analytics in financial sectors is risk management, because
risk affects the business and can be detrimental to its success and sustainability. Lochy (2019)
highlights that big data applications in banks can cut across all sections of operations.
Big data analytics helps companies form risk models and find ways of handling
risks. It helps reduce business costs, and organizations can identify risks by analyzing market
trends, social media sentiment, and spending patterns (Storey & Song, 2017). Financial
institutions can discover and stop fraud by using machine learning algorithms that process large
volumes of existing data. There are ways to identify bad creditors from social media. Big data
minimizes the time and resources required to make detailed analyses. Big data assists financial
service providers in analyzing risks before losses occur (Storey & Song, 2017).
High Revenues and Customer Satisfaction
Organizations have applied big data solutions to create analytics systems that foresee
clients' behavior. Insight into client behavior is crucial to companies because it lets them improve
their efficiency and reduce unnecessary delay, translating to customer retention, satisfaction, and
high incomes. Fedak (2019), in his article on big data analytics in the financial sector, highlights
that big data is the most significant domain in innovation. In 2016, the investment in big data
analytics was $20.8 billion (Fedak, 2019). Banking customers, while transacting money for
various activities, generate enormous data. Banks can understand their customers and improve
their services to retain customer loyalty through big data analytics (Kumar, 2017).
Quick Processes
Big data has transformed the finance industry with operational efficiency. For example,
big data helps automate all government compliance requirements and share them with customers.
Financial institutions can file for compliance with ease through the relevant agencies. Big data
agencies help ensure good performance in the industry by using performance analytics to figure
out the best performance and make assessments based on data. Big data also helps companies
maintain quarterly and annual targets that keep work efficient (Sahu, 2018).
Big data enables data integration, which helps businesses respond to the need for change. When
organizations have complete access to all transactions and other activities, they can identify
the best strategies and ensure they meet their targets, including customer satisfaction. Financial
institutions are embracing the utilization of big data in their decision-making processes.
Drivers of big data use in financial institutions include data growth, increased regulatory
scrutiny of data management, and a changing business model due to the continually evolving
environment influenced by socioeconomic and political factors and advancement in technology
(IBM, 2019).
Enhanced Channels of Purchase
In sales and marketing, big data can enhance the efficiency of marketing acquisition,
activation, and up-selling opportunities. Besides, it provides cost-effective solutions that banks
can maintain during advertising and promotions while at the same time attracting customers. In
addition, big data has the potential to manage the various risks in banks and prevent them
from materializing. For example, cyber and credit card fraud, insurance fraud, and liquidity risk
are some of the many challenges facing banks today, and big data can help detect and prevent
such threats (CyberTalk, 2020).
Big data and better management systems allow service providers to get data from
multiple web applications to increase purchase levels and better track performance
and customer feedback. Cloud activities help improve customer purchases, enable daily metrics
and performance forecasts, and improve data analysis. Companies can now find customers with
the highest chance of purchasing their products, minimizing the cost of performing generalized
outreach. In addition, analytics has minimized guesswork and uses data integration to combine
market trends (Singla, 2020).
Financial Analysis, Performance, and Control Growth
There are multiple businesses and many responsibilities; hence financial analysis of
performance and growth control between organizations becomes challenging. However,
companies can report automatically every day with big data and improve productivity
at the same time (Hussain & Prieto, 2016). Big data aims to instruct, explore, connect, and execute.
The focus of big data is on gathering information and observing markets. Further, it focuses on
forming strategies and roadmaps that suit company needs and handle challenges (Hussain &
Prieto, 2016).
Big data has led to new opportunities to help financial institutions with advanced
analytics. Growth is noticeable in the launch of new products and increasing revenues. Big data
has transformed the finance industry, which can now learn customer feedback and needs (Tyagi,
2020). Companies can identify the critical approaches to predictive planning and customer
satisfaction (Tyagi, 2020). Artificial intelligence is advancing quickly and changing how people
interact in the world. Artificial intelligence will help with client engagement and the transport
industry (Tyagi, 2020).
Big data in financial institutions has led to lower costs, especially for hardware, and enhanced
productivity and efficiency. Financial organizations have improved their operations and have
made better decisions in customer service, fraud prevention, customer targeting, high
performance, and better risk assessment methods (Hussain & Prieto, 2016). Traditional methods
relied on humans to perform analysis and calculate risks, but computers can
perform better and offer more accurate results (Hussain & Prieto, 2016).
However, while many institutions are integrating big data, some are still lagging. Concern about
big data privacy is the primary obstacle to the large-scale and rapid adoption of big
data. Besides, an IBM report highlights that most organizations are yet to understand the scope
of big data, and some are still running pilot studies. The lag is associated with the difficulty of
integrating the voluminous data generated (Hussain & Prieto, 2016).
The banking sector has changed, and customers prefer using checks instead of cash and
using electronic banking to perform transactions. Banks have moved to create mobile apps and
portals to ensure convenience and improve customer service. However, this transition leads
to serious cybersecurity risks (Lessambo, 2020). Many banking applications are vulnerable
because of server insecurity, unsafe data storage, possible data leakage, lack of proper
encryption, and inadequate authentication and authorization when logging in (Fathoni et al.,
2020).
Big data plays a big part in organizations' policies and strategies. Using data to influence strategy
is new, and most industries have begun to experience it. Companies can improve how they
make predictions and build those predictions into their overall strategy. Traditional methods did not
leverage data for financial planning, but analysts now use big data to run operations and
keep the business sound into the future (Storey & Song, 2017).
Digital Transformation
Digital transformation has made a considerable difference in the financial industry, and it
is due to data. Companies can analyze and understand their products and services. Companies
use new digital products or services to ensure customer retention. For example, machine learning
algorithms can trace transactions and determine how many customers make payments through
their bank. Banks, therefore, get an excellent chance to improve their services and create new
services to sell to their customers (Sahu, 2018).
Customer Segmentation
Financial institutions offer customers many services, and big data enables analysis of customer
behavior and spending patterns. Customers' experience will improve, and artificial intelligence
will enable specialized products. The service industry will use big data to
predict future needs and develop product ideas to meet the demand. In addition, there will be a
socially engaging and digital ecosystem where banks will use social media to interact with
customers to improve relationships (Storey & Song, 2017).
Engaging the Workforce
The benefits of big data include better workforce engagement. Big data can improve working
experiences in organizations, but it has to be implemented adequately to work effectively.
Organizations will be able to track, analyze, and share employee performance. Big data helps
financial institutions determine the top performers among their employees (Shamim et al., 2018).
Client Data Accessibility
Companies can have more information about their clients through big data. Good
customer service means that employees are doing a good job. Big data helps organizations
measure performance in their projects (Storey & Song, 2017). Big data also creates a safer
environment for users because it can identify fraudulent activities. For example, Alibaba Group
developed a fraud risk management system that leverages real-time big data processing (Storey
& Song, 2017). Such systems analyze consumer data in real time and detect fraudulent
transactions (Storey & Song, 2017).
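The idea of screening transactions as they arrive can be illustrated with a minimal Python sketch. It is not the Alibaba system or any method described by the cited authors; it is an assumed, simplified rule that flags an amount lying far from a customer's recent spending. The class name, the window size, and the threshold of three standard deviations are illustrative assumptions.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class RollingFraudScreen:
    """Flags a transaction whose amount deviates strongly from the
    customer's recent history (hypothetical rule, for illustration)."""

    def __init__(self, window=20, min_history=5, threshold=3.0):
        self.min_history = min_history  # amounts needed before scoring
        self.threshold = threshold      # z-score above which we flag
        # One bounded history of recent amounts per customer.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def check(self, customer_id, amount):
        """Return True if the transaction looks suspicious."""
        past = self.history[customer_id]
        suspicious = False
        if len(past) >= self.min_history:
            mu = mean(past)
            sigma = pstdev(past)
            if sigma > 0 and abs(amount - mu) / sigma > self.threshold:
                suspicious = True
        # Only unflagged amounts are added to the customer's baseline.
        if not suspicious:
            past.append(amount)
        return suspicious


screen = RollingFraudScreen()
for amount in [20, 25, 22, 24, 21, 23]:
    screen.check("customer-1", amount)
flagged = screen.check("customer-1", 5000)  # far outside recent spending
```

A production system would combine many more signals (device, location, merchant, velocity) and would be trained on labeled data; the sketch only shows the per-event, per-customer shape of a real-time check.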
Big data offers many industries opportunities to improve operations, especially
banking and financial institutions. It is more effective, provides better solutions, and gives the
customer more valuable attention. Many organizations' primary concerns are customer
satisfaction and eliminating risks and fraudulent activities; with big data technologies addressing
these, they can now concentrate on business growth and increasing profits (Shamim et al., 2018).
Big Data Technologies
The term “Big data” is used to describe vast and complicated data sets that cannot be
analyzed effectively using traditional data-processing software (Singh & El-Kassar, 2019). The
term sometimes also describes the field that deals with the systematic analysis of such large and
complicated data sets (Singh & El-Kassar, 2019). Many technologies are in use in the area of big
data. The technologies help big data analysts overcome the limitations of traditional software in
big data analytics. Big data technologies are vital in extracting valuable insights from vast and
complicated data sets (Oussous et al., 2018).
Managing big data requires tools and technologies designed to collect and analyze
voluminous data. Traditional methods have limited capabilities and scalability and restrict the
ability to harness data to obtain insights. It is necessary to have the right tools to extract relevant
information to improve operations, sales, and customer experience (Pejić Bach et al., 2019).
Infrastructure management and security techniques alone are insufficient to enhance user data
privacy (Hasan et al., 2020). Rules and regulations enforced by state and federal governments
govern the application of big data. The protection of user data is crucial, and these rules
explicitly limit how personal information is obtained and used. Big data technology is robust and
reduces the workload faced by organizations.
Big data technology ensures that companies make helpful decisions through data analysis
(Sahu, 2018). Big data technology is also crucial because organizations can compile the data and
find better opportunities. Companies can make good business moves hence becoming more
efficient, increasing revenues, and satisfying customers (Zomaya & Sakr, 2017).
When big data technology is applied properly, it brings benefits such as cost reduction. It
is also beneficial because it supports in-memory analytics and can analyze new data sources. Big data
technologies like Hadoop and cloud-based analytics are cost-effective in storing large volumes of
data and conducting business (O'Driscoll et al., 2013). Companies can analyze data in an
instant and make informed decisions based on their findings. Through analytics, big data
technology has enabled businesses to measure customer preferences, needs, and satisfaction. As
a result, businesses can satisfy customers' needs by creating new products (Sahu, 2018).
Big data technologies cover data mining, storage, sharing, and visualization; the
broader term embraces the data itself and a data framework with tools and techniques to explore and
transform data (Tyagi, 2020). Big data is constantly evolving, and new technologies emerge to satisfy
the increasing demand in the IT sector and to improve work delivery. Big data
management requires storage tools, computing tools, and support technologies designed
to handle big data.
The choice of storage tools depends on the size and scale of data and growth expectations
(Almeida, 2017). Processing big data requires supporting technologies that facilitate
communication between devices. Technologies such as SQL and NoSQL databases allow retrieval of
data and at the same time offer storage. These technologies can evolve with the data, so they are
appropriate for big data.
Big data technologies come in two kinds: operational and analytical.
Operational big data technologies handle the data generated every day, such as online
transactions, social media, and data from specific companies, which is then analyzed through big data
technology-based software. This data serves as raw input for analytical big data technologies.
Companies that use operational big data technologies include Amazon, Flipkart, Wal-Mart,
railways, and booking services (Storey & Song, 2017). Analytical big data technologies are more
complex than operational ones. They involve the analysis of vast amounts of data essential for
business decision-making, such as stock market and time series analysis (Tyagi, 2020).
Many computing and storage tools described in the literature handle big data and enable
extraction and analysis. For example, NoSQL databases and Hadoop MapReduce are popular
computing tools for processing and distributing large data sets. Other tools include
column-oriented databases, Scala, Apache Giraph, Cloudera Impala, Apache Spark, R programming,
data lakes, predictive analytics, in-memory databases, blockchain, and
Tableau. These tools achieve different goals. For instance, Apache Giraph performs graph
processing (Almeida, 2017; Furht & Villanustre, 2016).
One type of big data technology used today is the NoSQL database
(Bjeladinovic, 2018). According to Bjeladinovic (2018), NoSQL databases are a variety of
non-relational databases with flexible schemas that help build modern applications. NoSQL
databases come in several types, including key-value, document, search, and in-memory databases.
The technology is used in big data to store large, semi-structured, or unstructured data sets.
One example of a NoSQL database in use today is MongoDB.
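What a flexible schema means in practice can be shown with a small Python sketch. This is an in-memory stand-in written for illustration only; it is not MongoDB's API, and the class name, method names, and sample records are all assumptions. The point is that documents in one collection need not share the same fields.

```python
import json


class DocumentCollection:
    """Toy document store: each record is a JSON-like dict with no fixed schema."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        """Store an independent copy of the document and return its id."""
        doc_id = self._next_id
        self._next_id += 1
        # The JSON round trip copies the data and rejects non-JSON values.
        self._docs[doc_id] = json.loads(json.dumps(doc))
        return doc_id

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            doc for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


customers = DocumentCollection()
# The two documents have different shapes, which a fixed relational
# table would not accept without a schema change.
customers.insert({"name": "A. Rivera", "segment": "retail"})
customers.insert({
    "name": "B. Chen",
    "segment": "retail",
    "cards": [{"type": "credit", "limit": 5000}],
})
retail = customers.find(segment="retail")
```

Both records are returned by the same query even though only one of them carries a `cards` field.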
One benefit of using NoSQL databases in the big data field is their flexibility. The
databases can handle semi-structured or unstructured data sets, and a lot of data in the big
data field is either semi-structured or unstructured. Another benefit of NoSQL databases is that
they are easily scalable: expanding them to handle vast amounts of data is straightforward,
which makes big data easier to manage. The third benefit of
the databases is that they enable high performance. One downside to using NoSQL databases is
that they can be vulnerable to attacks. Some deployments lack crucial security features, such as
authentication and authorization, or leave them disabled, so the databases can be easily compromised.
The second big data technology is Apache Hadoop or simply Hadoop (Azeroual & Fabre,
2021). Hadoop is a software library that provides software utilities to enable the distributed
processing of large data sets (Azeroual & Fabre, 2021). The technology offers a software
framework to allow the distributed storage and processing of large and complex data sets.
Hadoop provides several crucial benefits in big data analytics. One vital benefit is the ability to
store big data over large clusters of servers. It helps overcome big data storage issues.
Another benefit is that it helps process big data over clusters of servers. Thus, it enhances
scalability in big data storage and processing. The third benefit of the technology is
that it helps ensure high availability. The software framework is designed to detect and handle
failures at the application layer, which increases the availability of big data systems. Hadoop
can also improve security: when configured to do so, it encrypts data with robust encryption
algorithms as the data is written to storage. The software has helped significantly ease the
management and processing of big data.
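The distributed processing model behind Hadoop MapReduce can be sketched in a few lines of Python. The sketch runs in a single process and does not use Hadoop itself; the function names and the word-count task are illustrative assumptions. It shows the three steps that a cluster would spread across servers: map each record to key-value pairs, group the pairs by key, and reduce each group to a result.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce over the records in one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = run_job(["big data", "big clusters"])
```

Because each record is mapped independently and each key is reduced independently, both phases can run in parallel on different servers, which is the source of the scalability described above.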
The third technology used in big data analytics is Apache Hive (Gunay et al., 2019).
Gunay et al. (2019) describe Hive as a data warehousing technology developed over the Hadoop
software library. Hive provides some important benefits in big data analytics. One of the benefits
is big data querying. The software provides an interface to query warehoused data easily. Data
querying helps analyze big data.
Another benefit of using Hive to analyze big data is that it runs queries fast. Thus, it helps
analyze a lot of data relatively quickly. The third benefit is that multiple users can use the
application at the same time. Use by numerous analysts helps analyze huge data sets quickly. Hive
is highly scalable because it is built over the Hadoop software library, which allows adding more
servers to the Hadoop cluster. Securing Hive depends on securing Hadoop and on configuring Hive
authentication and authorization. Hive is more secure when enough effort is invested in
securing the technology.
The fourth technology with growing application in big data analytics is the R
programming language (Schmidt et al., 2017). According to Schmidt et al. (2017), data miners
and statisticians have widely adopted the R language for data mining and statistical software
development. The language integrates with several big data analytics
products, such as RevoScaleR and Big Data Appliance. R is both a programming language
and an open-source project (Tyagi, 2020). Data miners and statistical experts use it to design
statistical software and perform data analytics (Furht & Villanustre, 2016). It is used primarily
for statistical computing and visualization, and it has become a widely used, high-quality
language around the world.
A data lake is a consolidated repository that stores data in many formats, both
structured and unstructured, at any scale. Data can be stored as it is, without first converting it
to a structured form, and can then support many types of analytics, from dashboards and
visualization to big data processing, real-time analytics, and machine learning (Tyagi, 2020).
Companies that adopt data lakes can lead their industries and pursue new types of analytics. A data
lake helps an organization find suitable opportunities for effective business and growth by
attracting and communicating with customers, maintaining productivity, and making informed decisions
(Zomaya & Sakr, 2017).
One of the benefits that the R language provides to the big data analytics field is
the development of statistical software that helps analyze big data better. Another benefit is that
the language supports various data types and can therefore analyze them effectively.
R also has powerful graphics capabilities, so its use in big data analytics
helps create high-quality data visualizations.
An in-memory database is kept in the computer's main memory and handled by an in-memory
database management system. Traditional databases were stored on disk drives, which made access
more complicated and time-consuming. The in-memory database is practical: it is designed to take less
time and to minimize the chances of losing data due to process or server failure (Storey & Song,
2017).
The fifth technology used for big data analytics is QlikView (García & Harmsen, 2017).
According to García and Harmsen (2017), QlikView is a Business Intelligence software used to
analyze raw data and generate knowledge. The software is capable of data visualization. One
benefit of QlikView in big data analytics is that it helps generate helpful knowledge and insights
from raw data. The knowledge and insights are vital to organizational decision-making. Another
benefit is that the software helps develop visualizations that help better understand patterns
existing in analyzed big data.
QlikView has some crucial security features to help ensure the security of big data and
generated knowledge. Three of the security features included in QlikView are authentication,
authorization, and encryption. A user is required to provide login credentials to access QlikView.
Users can access only the features and data that they have been authorized to access.
Communication between a QlikView Server and a QlikView client is encrypted using a secure
algorithm.
Another big data technology is Apache Spark. Spark is a fast processing engine that supports
large-scale data transformation through structured streaming and SQL (Furht & Villanustre, 2016).
Spark is widely used alongside Hadoop because of its high data processing speed. It
helps reduce the waiting time for queries and program execution. Spark is also
used with Hadoop to store and process data, and it is often more efficient than MapReduce (Zomaya &
Sakr, 2017). It supports languages commonly used for big data, such as Java.
The many technologies used in data analytics today are vital in generating knowledge and
valuable insights from vast and complicated data sets. Technologies like NoSQL databases,
Hadoop, Hive, the R language, and QlikView, among others, have increasingly eased the
storage, access, analysis, and management of big data. Continued development of existing and
new big data technologies will further improve big data analytics.
Data Privacy Issues
Financial institutions hold a lot of consumer personal information. For instance, a credit
bureau operating in the United States, such as Equifax, holds the personal information of more
than 300 million people. Credit bureaus collect data from financial institutions to create credit
scores for individuals. Also, banks possess comprehensive files about their customers, including
employment information, lawsuits and claims, bankruptcies, liens, eviction filings, and past
addresses (Muraleedharan, 2014). This information is sensitive and confidential, and its security
causes people a lot of concern.
There is still no best method to protect big data privacy (Soria-Comas & Domingo-Ferrer,
2015). Traditional methods applied to protect personally identifiable information include
“consent, purpose limitation, necessity and data minimization, transparency and openness,
individual rights, information security, accountability, and data protection by design”
(Soria-Comas & Domingo-Ferrer, 2015). These methods protect the collected data against unauthorized
access and ensure that, during processing, data owners know and consent to how their data is
used.
A poll by Gallup in 2014 found that many consumers lost confidence and trust in
financial institutions after the great recession. However, thirty percent of those polled trust
financial institutions with their personal information more than any other entity (Fleming &
Kampf, 2014). Therefore, banks need to develop systems that focus on providing consistent
services and meeting customers’ expectations.
In a 2017 poll by International Data Corporation, 84% of those polled were
concerned about keeping their personal information private (Fedak, 2019). Seventy percent of those
surveyed said they were more concerned with data privacy and were willing to switch banks if
attacks threatened their financial information (Fedak, 2019). Seventy-three percent of those
polled reported that they would immediately change institutions in the case of a breach (Fedak,
2019). The report also found that the younger generation is more concerned about information
privacy than the older population because of its awareness of hackers and cybersecurity issues.
The biggest challenge currently facing the financial sector is satisfying the regulations
governing the use of personal consumer data. Using this data lets institutions personalize the
customer experience and attain profitability (Hasan et al., 2020). However, generating data at many
points, as is the current trend with many institutions, creates many vulnerabilities (Véliz, 2018).
With big data, the risk of insecurity and of breaches that expose information multiplies (Kuepper,
2020).
An organization’s utilization and manipulation of data can cause a privacy problem (Srinivas et
al., 2019). For instance, after collecting massive data from various sources, an organization uses
statistical analysis and data warehousing to learn and predict customer relationships with the
entity. In this case, users are unaware of the use, and they have given no authorized permission.
Data privacy is one of the essential issues, and organizations have to improve their due
diligence methods. Information, data privacy, and security concerns are persistent trends every
year (Singh & Singh, 2017). In the beginning, privacy issues were about online trackers that
served pop-up adverts for things people did not need, but now the problems have grown into political
and ethical scandals. Privacy has become part of wider cultural issues.
Data privacy should be at the top of the business risk management plan. Privacy issues
increase every year, and in some cases organizations fail to adhere to regulations and compliance
requirements (Singh & Singh, 2017). The main privacy issues are as follows:
Embedding Data Privacy
Many organizations do not have a comprehensive data privacy system and
concentrate only on IT security and disaster recovery plans. Their privacy methods are not detailed
enough, since data privacy touches on many areas in an organization. Privacy issues have
increased because companies treat privacy as an afterthought instead of including it in data
strategy and staff training (Steinmann et al., 2015).
Proliferating Devices
Data privacy is challenging because of additional elements like the internet of things,
phones, and watches. The work environment has more devices hence increasing data privacy
issues. Organizations face challenges in ensuring all devices, operating systems, and apps are
secured. In addition, the organization should ensure there are proper data governance procedures
(Kumar et al., 2018).
High Maintenance Costs
Data privacy procedures are costly; hence, organizations must budget and invest well to
afford the costs and prevent the high expenses caused by a data breach. The cost of prevention is
always lower than the cost of a data breach. Automating processes minimizes data silos,
lowers costs, improves governance and controls, reduces human error, and removes manual processing
(Kumar et al., 2018).
Access Control
One cause of data privacy issues is poor management in companies. People are a point of
vulnerability and strongly affect privacy and security. When humans are involved, it is challenging
to control access to sensitive data and protect it. Access control issues require effective
strategies and steady data governance systems (Xie, 2018).
Visibility of Data
Organizations cannot secure sensitive data if they are not aware of its location and the
nature of its sensitivity. Data privacy requires a thorough understanding of the data and methods
to find and group essential data. With proper data classification, it is easier to protect data
and avoid privacy issues (Steinmann et al., 2015).
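Finding and grouping sensitive data can start with simple pattern matching, as in the following Python sketch. It is an assumed, minimal approach rather than a method from the cited sources: the three patterns (email address, US Social Security number format, and a 13 to 16 digit card-like number) and the two labels are illustrative, and real classification tools use far richer rules and validation.

```python
import re

# Hypothetical detection patterns, chosen for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def classify(text):
    """Return a (label, matched_types) pair for one piece of text."""
    matched = {name for name, pattern in PATTERNS.items() if pattern.search(text)}
    label = "sensitive" if matched else "public"
    return label, matched


records = [
    "Contact: jane@example.com",
    "SSN on file: 123-45-6789",
    "Branch opening hours: 9 to 5",
]
labels = [classify(record) for record in records]
```

Once records carry a label, stricter storage, access, and retention rules can be applied to the sensitive group only.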
Negative Data Culture
Storing a lot of data increases privacy issues. Organizations hoarding data are always at
risk of attacks and breaches. Organizations need to be smart about what data they keep and ensure
proper collection and storage of data. A good data culture minimizes data privacy
issues and enhances data privacy (Soria-Comas & Domingo-Ferrer, 2015).
Big Scale of Data
Many organizations are facing challenges because of the large amount of data they
handle (Yu, 2016). The more data a company accumulates, the greater the challenges of
management and protection (Yu, 2016). Many organizations cannot hold large-scale
data, and they need to find solutions to ensure their data is protected.
Cybersecurity
Cybersecurity is a subset of information technology. It safeguards computer systems from
damage to or theft of hardware, software, or electronic data, and it applies across
technology and the Internet of Things. A well-designed cybersecurity system covers
hardware, networks, programs, and data (Kuepper, 2020).
The goal is to guarantee that the internet is safe and that cybersecurity safeguards against
assaults like identity theft, total data loss, and extortion attempts. Cybercrime affects
organizations, and it is considered one of the most significant threats companies experience
(Srinivas et al., 2019). These crimes are growing fast, and they lead to considerable losses for
businesses. Cybercrime has cost the global economy millions of dollars and disrupted
business operations and essential services (Srinivas et al., 2019).
Cybersecurity programs have increased their scope to ensure most areas are covered.
Still, new technologies like artificial intelligence and quantum computing can change
cybersecurity overall (Srinivas et al., 2019). Quantum computers are expected to be far more
powerful than the computers used today, which will seriously change
industries like science, medicine, and financial services. Quantum computing poses a massive
threat to cybersecurity because it will weaken the encryption methods in use, increasing the
chances of cyber-attacks (Seemma et al., 2018).
Due to these changes, companies need to develop improved defense mechanisms that operate within
the law and comply with standards (Seemma et al., 2018). Data security is an ongoing
issue with many threats and changes as technology advances. Internet of Things devices have
continued to increase in the market, increasing the chance of security threats. Businesses and
consumers are likely to face various cybersecurity issues as new devices bring new
risks. Corporate networks are also threatened (Seemma et al., 2018).
Cybersecurity comprises technologies and practices designed to safeguard networks,
devices, programs, and data (Seemma et al., 2018). Cybersecurity is also known as information
technology security. Cybersecurity consists of various types, including network security,
application security, information security, operational security, disaster recovery, and business
continuity (Seemma et al., 2018). Cybersecurity faces threats like cybercrime, cyber-attack, and
cyber-terrorism.
Advantages of Implementing Cybersecurity in Financial Institutions
The banking sector has grappled with security issues for over 100 years
(Wendt, 2020). Threats to the banking sector, from the physical theft of money to computer fraud and
other cyber threats, have reduced people’s trust in banking institutions. Cybersecurity is
important because it maximizes the security of user data and secures transactions to protect
customer assets (Wendt, 2020).
The move toward a cashless society, with online
transactions and physical card scanners, is expected to help mitigate cyber threats in
banking institutions (Lopez-Rojas & Barneaud, 2019). Fraud affects customers, but it
also impacts the banking institution as a whole, since recovering lost data is not an easy
task. As a result, when fraud happens, a bank may have to pay a hundred thousand dollars to
replace the lost cash and compensate the customers affected by the incident (Wendt,
2020).
Protect Bank Institutions from Being Hacked. Cybersecurity systems offer comprehensive
digital protection to the organization. Employees can use computers and the internet without
facing any threats. Most banking institutions run training sessions that teach clients to
keep themselves safe while transacting, and they protect clients with end-user encryption to stop
criminals from using their personal information. Cybersecurity works best when it closes off the
avenues for data breaches, saving customers and the bank itself from incurring losses. It also
restores the trust people need in their financial safety and offers them maximum income
protection for future generations (Lopez-Rojas & Barneaud, 2019).
Data breaches are among the leading causes of insecurity about people’s money, thus
robbing them of their trust in financial institutions (Seemma et al., 2018). By imposing a
cybersecurity framework, bank institutions will have solid grounds to argue that customers’
money is safe in their hands. A framework sets out up front the security parameters that protect
the crucial data of bank institutions, keep records, and maintain the standard of personal
information safety that clients need as one of the most important aspects of banking.
Reduce Financial Losses. Cybersecurity works best to improve customers' confidence in
the banking sector (Prem Khatri, 2019). It also reduces the financial losses that customers and the
bank itself suffer when customer cash is lost. The general way to reinforce the trust that customers
need in financial institutions is to incorporate the security
frameworks that protect user data (Prem Khatri, 2019).
Besides, security audit software is available to reveal the strength of the existing setup in
banks and to identify new ways of protecting user data from threats (Prem Khatri, 2019). It provides
recommendations that help secure all transactions and keep customer data safe, which in turn
helps individuals protect money for investment. Also, firewalls, deployed as software
or hardware, block attacks and reduce information breaches with an
update on malicious activity, helping eliminate the fraud that has affected banking
institutions for a long time.
Notably, banks use multifactor authentication to provide critical
protection on the mobile and online apps that customers use for banking services (Carter &
Zheng, 2015). Multifactor authentication protects user data by notifying customers when suspicious
activity is happening and alerting the bank so that it can stop the incident
immediately. It is a tremendous advantage for banking institutions because it protects user data and
saves the bank from losing money to compensate customers whose money was stolen through
cyber theft (Carter & Zheng, 2015). The platform sends a six-digit code to the
customer's mobile number to confirm that the customer is the actual individual transacting, hence
preventing theft and malicious access, among others.
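The six-digit code step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not any bank's implementation: the function names, the two-minute validity window, and the single-use rule are assumptions, and delivery of the code to the customer's phone is left out.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 120  # assumed validity window for a code


def issue_code(now=None):
    """Create a random six-digit, single-use challenge."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically strong digits
    return {"code": code, "expires_at": now + CODE_TTL_SECONDS, "used": False}


def verify_code(challenge, submitted, now=None):
    """Accept the submitted code only once and only before it expires."""
    now = time.time() if now is None else now
    if challenge["used"] or now > challenge["expires_at"]:
        return False
    # Constant-time comparison avoids leaking how many digits matched.
    if hmac.compare_digest(challenge["code"].encode(), submitted.encode()):
        challenge["used"] = True
        return True
    return False


challenge = issue_code()
accepted = verify_code(challenge, challenge["code"])  # first use succeeds
replayed = verify_code(challenge, challenge["code"])  # reuse is rejected
```

A real deployment would also limit the number of attempts per challenge and store challenges on the server side, keyed to the pending transaction.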
For decades now, leading banks in the United States have been affected by losses due
to breached information. To curb such incidents, cybersecurity offers a long-lasting
approach to customer data protection. It reduces fraudulent access to information
by third parties and helps banks regain the confidence their clients once had in them (Seemma
et al., 2018).
Protect User Information. Personal information is valuable, especially in the era of
information technology (Wendt, 2020). Employee and consumer protection is essential
because cybercriminals use stolen data to commit fraud and other crimes.
As information security is a crucial part of national security, bank institutions rely on
cybersecurity to protect user information. This electronic protection reduces losses through
measures such as automatic logout on financial websites and applications, which lets users
stay logged in only while they have permission (Wendt, 2020).
It limits third-party access to clients’ information by requiring
login credentials and authorization, which prevents unauthorized access to the data (Wendt, 2020).
Automatic logout closes user access after a few minutes to prevent
information breaches. Also, bank institutions have educated their clients on ways to
clear their information online to stay safe while using mobile and online apps (Wendt,
2020). These measures aim to address the activities that expose users to
cyber-theft.
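Automatic logout can be sketched in Python as an idle timeout on a session. The sketch is illustrative only: the class name, the five-minute limit, and the choice to measure time since the last activity are assumptions, since the text says only that access closes after a few minutes.

```python
import time

IDLE_TIMEOUT_SECONDS = 300  # assumed value for "a few minutes"


class Session:
    """Tracks one logged-in user and ends the session after idle time."""

    def __init__(self, user_id, now=None):
        self.user_id = user_id
        self.last_activity = time.time() if now is None else now
        self.active = True

    def touch(self, now=None):
        """Record a request. Return False, and log the user out,
        if the session was already closed or has idled too long."""
        now = time.time() if now is None else now
        if not self.active or now - self.last_activity > IDLE_TIMEOUT_SECONDS:
            self.active = False
            return False
        self.last_activity = now
        return True


session = Session("customer-1", now=0)
still_in = session.touch(now=100)    # 100 s idle: request allowed
timed_out = session.touch(now=500)   # 400 s idle: session is closed
```

Once a session is closed it stays closed, so the user must authenticate again before any further request is served.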
Also, some banking applications are fake, and using them exposes customers directly to
cybercriminals, who can compromise their information and even loot money from their accounts.
Through education, clients learn the consequences of these vulnerabilities and change their
habits for fear of losing their investments. Cybersecurity measures make the security
team directly responsible for ensuring that customers’ information is kept safe and that their
accounts are not terminated or accessed by unauthorized individuals (Lopez-Rojas & Barneaud,
2019). The financial transaction infrastructure is a cornerstone of modern society. These measures
offer maximum protection to users' transaction information and record every activity on the
account for accountability (Lopez-Rojas & Barneaud, 2019).
Bank institutions have incorporated these security measures into online transactions since
they are the safest way to protect their clients' information, data, money, and account records
(Seemma et al., 2018). Firewalls, automatic logout, multi-factor authentication, and anti-malware
applications together work best to reduce bank losses. While firewall upgrades increase
protection, they will not stop attackers unless the anti-malware application is also kept up to
date. Consequently, bank
institutions need to update their systems regularly to clear out any information online and avoid
all instances that can lead to third-party access to user data (Seemma et al., 2018).
Protect Websites and Prevent Spyware. Cybersecurity systems offer security to
employees who constantly face the risk of possible attacks (Kuepper, 2020). If hackers attack
employees’ computers, the result can be low productivity and high costs. A business website is
essential, and an attack can shut it down, costing the business money and customers. Reputation
is important; hence customers need to feel safe on a secure website. Spyware is a
cyber-infection made to spy on computers and pass information to a criminal. Cybersecurity
systems prevent spyware and protect employees' activities (Kuepper, 2020).
Cybersecurity Challenges
Firewalls are challenging to configure, and a misconfigured firewall keeps users from working
efficiently (Jain et al., 2016). Constant upgrades are required to maintain protection, which is
time-consuming. Cybersecurity systems are expensive, and some people cannot afford them.
However, organizations can overcome these drawbacks by using the correct configuration.
Proper planning is also essential to ensure organizations have an adequate budget for such
upgrades (Jain et al., 2016).
Cybersecurity helps ensure protection against ransomware, malicious software that
infects a computer and shows messages demanding payment before the system will work
again (Seemma et al., 2018). Firewall rules are challenging to configure, and users cannot use
the systems when the rules are wrong. Cybersecurity systems can also be slow and keep users
from working efficiently (Seemma et al., 2018). Identity protection is essential because
cybercriminals use personal or financial information to commit fraud or authorize transactions
or purchases. Identity theft happens in various ways, and the victims face loss of their finances,
credit, and
reputation. Identity protection ensures that unauthorized people do not use credit reports,
financial activity, and social security numbers.
Cybersecurity is essential for every organization because data breaches and cyber-attacks
lead to huge damages (Rose & Johnson, 2020). Cyber-attacks on businesses lead to class-action
lawsuits, regulatory fines, and a negative public image. For instance, Yahoo paid
thirty-five million dollars for failing to report a data breach in which hackers took
personal information from millions of user accounts (Rose & Johnson, 2020).
Similarly, a Polish airline faced a cyber-attack in 2015 on its flight-planning computers,
leading to flight cancellations and delays (Rose & Johnson, 2020). The threat came from a
distributed denial of service (DDoS) attack. The airline’s computer systems were overloaded
and could not function normally. Airports in Istanbul were also attacked, and the attackers shut
down their passport control systems (Rose & Johnson, 2020). Passengers were delayed and spent
hours waiting for flights following the cancellations. The airline restored its systems, but the
damage was extensive. To reduce such attacks, airports need to implement cybersecurity systems
that offer protection. Organizations can enhance cybersecurity by creating awareness of the
existing dangers and preparing strategic plans to ensure that every system is protected (Rose &
Johnson, 2020).
The exploitation of the Internet of Things happens because of software and hardware
weaknesses in banking organizations (Habibzadeh et al., 2019). These weaknesses arise from
many factors, including employees who use devices exposed to cyber-attacks.
Cybercriminals target home routers, printers, and IP cameras and use them as entry points to
attack financial institutions and reach other information, including customer
information (Habibzadeh et al., 2019).
Cyber-attacks succeed in banks where there is negligence; attackers use phishing to
retrieve credentials and private data. Banks also face scams such as money mule schemes, in
which criminals recruit individuals to send and receive money and pay them a small
amount in return. Some customers receive messages claiming that their accounts are blocked
and that they need to provide credentials such as logins (Wu et al., 2020).
Cyberwarfare is the use of digital attacks, such as viruses and hacking, against a country to
cause destruction, damage, and death. It involves governments or international
organizations attacking other countries’ computers and information networks. Cyberwarfare
targets infrastructure through the use of spies, hackers, and digital weapons (Habibzadeh et al.,
2019).
Cyber warfare can lead to battles and to attacks on computer systems, airports, and
power grids (Auxier et al., 2019). The target can also be the financial sector, where bank
accounts, stock markets, and digital currencies are affected. The attacks could also target
transport such as trains and buses, disrupting business and other operations (Auxier et
al., 2019).
Intruders hack drone feeds to access the information the drones collect and use it to
their advantage (Auxier et al., 2019). With new technology such as Internet of Things
devices, intruders hack in and access valuable information. Cyber weapons also include fake
digital fingerprints generated by artificial intelligence. Artificial intelligence engines can fool
fingerprint scanners and smart devices. Hackers use deepfake technology to trick people into
believing something unreal (Kuepper, 2020). Deepfake technology uses powerful techniques
from machine learning and artificial intelligence to create visuals and audio that hackers use
for deceit (Wu et al., 2020).
Defense against cyber warfare can happen through strengthening cyber systems and
guarding against social engineering (Auxier et al., 2019). Hackers and intruders take advantage
of weak computer systems and human limitations to obtain passwords and other forms of
authentication. Network protection requires strong defenses, strong passwords, and sound
authentication protocols. Another security measure is isolating sensitive systems from other
systems. Biometric features such as fingerprints, retinal scans, and face scans offer stronger
protection against cyber warfare than regular passwords. Blockchain is also believed to be a
defense against attacks that helps ensure system security (Auxier et al., 2019).
Cyber-attacks are powerful because data now controls the world. These kinds of attacks
continue to grow more sophisticated with the use of new technology. In the future, cyber
warfare will involve both human beings and artificial intelligence (Seemma et al., 2018).
Artificial intelligence can analyze and attack secure systems more quickly than human beings,
causing more severe destruction and damage than ever before.
Cybercrime affects organizations, and it is considered one of the most significant threats
companies experience (Chang, 2013). Professional training in cybersecurity has increased as
everything revolves around the Internet of Things and computing. The emerging trends in
cybersecurity involve the growing sophistication of attacks and breaches, innovation to enhance
the systems that counteract them, and quantum computing. These crimes are growing fast,
leading to considerable losses to businesses. Cybercrime has cost the global
economy millions of dollars and disrupted business operations and essential services (Chang, 2013).
Cybercriminals have become common and attack more frequently than before. They
have come up with ways to improve their profiles, and nations are becoming involved (Seemma
et al., 2018). Countries like China and Russia have had many cyber-attacks, which have resulted
in cybersecurity not being guaranteed. Many companies have invested heavily in cybersecurity,
but no investment offers perfect protection (Seemma et al., 2018).
Even the most famous companies that have made considerable cybersecurity investments
are always at high risk of cyber threats. Companies that have failed to ensure their cybersecurity
have lost millions of dollars, and many people have been affected directly or indirectly.
Organizations must follow the correct policies and procedures to ensure they can defend
themselves from attacks effectively. In recent years there has been an increase in awareness of
how essential cybersecurity is to organizations (Wu et al., 2020).
Cybersecurity continues to appear in more industries, and companies’ levels of attention
and readiness have increased. However, readiness differs depending on the industry in which a
company operates. Financial institutions are among the sectors that follow strict regulations
and therefore constantly update their cybersecurity compliance (Wu et al., 2020). Many
companies that did not take cybersecurity seriously now have to comply because of the increase
in data attacks and the severe fines that follow a breach. Cyber-attack sophistication has
increased due to ease of communication and the availability of information shared among
attackers (Wu et al., 2020).
Cyber breaches are increasingly becoming inevitable, and it is every company’s
responsibility to ensure that it is adequately prepared for attacks (Ebem et al., 2017).
Organizations are now more than ever running routine checks on external and internal
firewalls to determine and handle the possible impact of cyber breaches. To remain
vigilant, companies require both internal employees and external experts to check their
firewalls frequently for any potential attacks (Ebem et al., 2017).
Cyber-attacks have evolved over the years, and banks are among the most affected
institutions (Ebem et al., 2017). There is always a new variant of malware. Different
cybercriminals emerge and attack unsuspecting people, and the most prevalent attacks are
considered dangerous because they are the most successful, leaving banks and other institutions
vulnerable. The attackers use sophisticated methods to craft compelling emails that
appear legitimate, and bank staff and customers end up giving up their credentials. Various
scholars have identified the threats to the banking sector and how attackers use the information
they access to commit crimes, compromising customers’ private and confidential information
(Ebem et al., 2017). The risks and threats towards banking systems include:
Identity theft. Banks across the globe lose over fifteen million because of identity theft
(IBM, 2019). Millions of customers have become victims of this type of threat (Burnes et al.,
2020). Identity theft involves the use of people’s credit information without
their permission to borrow money and buy items. In some scenarios, whoever holds the stolen
information sells it on the dark web, and other cybercriminals
use it to breach customers’ finances and accounts (Burnes et al., 2020).
Cybercriminals buy information that consists of private credentials, account
information, usernames, and passwords and use it to get into bank accounts and credit
information (Ebem et al., 2017). Banks can manage identity theft by being more alert to signs
of fraud. Banks should also encourage their customers to review
their credit and account statements regularly to ensure they can account for all transactions
(Kahn & Liñares-Zegarra, 2015).
Malware. Malware compromises users’ computers and smartphones. Compromised devices
are dangerous because each time the user connects to the internet, sensitive data could leak
and the malware could eventually reach the bank networks, which compromises customers’
information and accounts (Grosse et al., 2017).
Financial malware like Zeus is a destructive Trojan that infects Windows users and pulls
private information from the infected computers (Etaher et al., 2015). Zeus has been used to
access bank customers’ confidential information and make unauthorized money transfers. Zeus,
also known as Zbot, has infected millions of people and affected thousands of accounts and
businesses (Etaher et al., 2015). Another widespread financial malware is GameOver Zeus,
which depends on peer-to-peer botnet infrastructure. Cybercriminals have used GameOver Zeus
to gather financial information, targeting credit card passwords and other private
data (Hutchings & Clayton, 2017).
The malware has affected over one million people around the globe. SpyEye is another
dangerous malware that steals data and, by design, takes customers’ money from online bank
accounts (Ronen et al., 2018). This vicious malware can steal bank documentation, social
security numbers, and financial information, enabling attackers to clear out bank accounts
(Ronen et al., 2018). Lastly, crypto-malware is one of the most destructive and dangerous kinds
ever made. It steals money from unsuspecting users by accessing their private data and
encrypting their information so that they cannot decrypt it. This malware exposes users’
personal information, and they lose data with no hope of ever recovering it. CryptoLocker
is a Trojan that infects users’ systems through email attachments that appear to come from
legitimate companies (Ronen et al., 2018).
Credential stuffing. It is a form of cyber-attack in which customers’ private data is
targeted. The hackers obtain stolen credentials and use them to log in to customers’ accounts,
after which they attempt to access relevant information (Rees-Pullman, 2020).
Credential stuffing is a form of brute force attack (Mueller, 2020).
Attackers enter many stolen credentials on a website until some of them match an
existing account (Rees-Pullman, 2020). Hackers then use the account information for their
desired purpose. Attackers use the acquired credentials to log into accounts, drain money, and
steal credit card numbers, committing further crimes. Hackers breach databases and take over
accounts through this form of web injection (Rees-Pullman, 2020). Entities and customers suffer
devastating impacts due to this form of breach.
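One common way to notice the pattern described above is to watch for a single source failing to log in under many different usernames within a short time. The following is a minimal sketch under that assumption; the class name, the 60-second window, and the threshold of three usernames are illustrative choices, not values taken from the cited studies.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60      # assumed sliding-window length
MAX_DISTINCT_USERS = 3   # assumed limit of distinct usernames per source


class StuffingDetector:
    """Flags a source that fails logins for many distinct usernames."""

    def __init__(self):
        # source address -> deque of (timestamp, username) failures
        self._failures = defaultdict(deque)

    def record_failure(self, source: str, username: str, now: float) -> bool:
        """Record a failed login; return True if the source looks suspicious."""
        events = self._failures[source]
        events.append((now, username))
        # Drop failures that fell out of the sliding window.
        while events and now - events[0][0] > WINDOW_SECONDS:
            events.popleft()
        distinct_users = {name for _, name in events}
        return len(distinct_users) > MAX_DISTINCT_USERS
```

A bank would typically respond to a flagged source by throttling it or requiring an additional authentication step; that response is outside the scope of this sketch.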
In 2014, JPMorgan Chase, one of the largest investment banking companies in the United
States, suffered a breach (Mueller, 2020). Attackers used credential stuffing to obtain accounts,
passwords, emails, and addresses of people who had registered for the charitable races held by
the organization (Mueller, 2020). The breach compromised the data of 76 million households
and 7 million small businesses (OECD, 2020).
Phishing attacks. Phishing is a common cyber-attack used to steal users’ data, including
credit card numbers and login details (Cui et al., 2017). Bank employees are targeted and tricked
into opening suspicious links, which eventually leads to malware that freezes the machine, and
then a ransomware attack occurs. The attack has severe outcomes for banks because the
attackers can access private information like customer accounts and financial information, which
leads to the loss of a large amount of money and damages the institutions’ reputation (Cui et al.,
2017).
Chiew et al. (2018) highlight that a phishing attack is a slow process, and criminals can
spy on their target for days or weeks before they attack. For example, criminals attacked the
account of Bangladesh’s central bank at the Federal Reserve Bank of New York. They monitored
it for a couple of weeks and ended up stealing over a hundred million.
However, there are ways of preventing such attacks, including information technology
personnel increasing protection for customers’ data and developing high-tech safeguards
(Peng et al., 2018). IT departments should always work on the assumption that the bank is
under attack, which enables them to create secure systems that minimize the extent of attacks
(Peng et al., 2018). The banking industry is susceptible to attack and bears responsibility to its
customers; hence the need to formulate effective security policies (Chiew et al., 2018).
Ransomware. This common type of malware works by encrypting data so that users cannot
access it unless they pay a demanded amount of money to the cybercriminals.
Cybercriminals use ransomware especially against banks because they know banking
institutions fear losing sensitive data and may be willing to pay ransom fees. This attack
happens especially to small financial institutions because they lack proper IT solutions, updated
security technology, and sufficient endpoint protection (Mohurle & Patil, 2017).
Dyre Wolf. It is a new type of malware that attacks the banking system. The malware uses
phishing techniques to bypass two-factor authentication and antivirus software. In 2015 the
malware was used to steal over one million from businesses (IBM, 2019). IBM Security warned
institutions to be careful because the attackers were unusually sophisticated and knowledgeable
about back-end systems and online banking (IBM, 2019). The attackers call innocent users and
persuade them to share their credentials, facilitating the ensuing wire transfers. The Dyre
attackers were dangerous because they knew how the banking system works and used clever
ways to wire money through many channels before it reached the intended destination. The
attacked banks could not trace the funds and hence lost all the money stolen (Kuhn et al., 2016).
Companies are now required to allocate reasonable budgets for information technology
security and defense because it is essential to their success and future. A cyber breach
has profound negative implications; hence, all organizations must comply with relevant policies
and standards and undergo constant audits. Many companies realize that cybersecurity cannot be
perfect, but the main focus is to ensure that they work around the clock to minimize the risk and
impact of attacks (Jang-Jaccard & Nepal, 2014). With modern technology, they have to develop
new cybersecurity structures (Tul, 2019).
Security Measures for the Internet of Things
The exploitation of the Internet of Things happens because of software and hardware
weaknesses in banking organizations (Ometov et al., 2019). These weaknesses arise from
many factors, including employees using devices exposed to cyber-attacks. Hackers target
home routers, printers, and IP cameras and use them as entry points to attack financial
institutions and reach other information, including customer
information (Ometov et al., 2019).
The Ursnif banking Trojan in Japan is a clear example of a malware attack. In March
2019, Japanese banks were attacked through the Trojan, and the hackers stole clients’
credentials from emails (Ometov et al., 2019). The attack was executed by sending phishing
emails with Excel spreadsheet attachments to unsuspecting people. When the recipients
downloaded the attachment, it led to severe damage and the loss of millions of dollars. The
Japanese banks could have implemented controls such as content security gateways, endpoint
security solutions, and social engineering and spear-phishing awareness (Ometov et al., 2019).
Brazil, like Japan, also faced malware in 2018, when Android users became victims of a
cyber-attack (Ometov et al., 2019). Mobile banking customers unknowingly downloaded
malware and ended up losing private and essential data. The Trojan was able to
access bank accounts and check balances, after which the attackers transferred money. The
victims lacked security
controls such as brand monitoring of application stores and end-user security alertness (Ometov
et al., 2019).
In 2015, there were reports of a malware attack in Europe in which many ATMs were
attacked and the hackers took away millions (Kuhn et al., 2016). The first case involved the
Tyupkin malware, which was installed on Windows-based ATMs made by the same manufacturer
and enabled the criminals to withdraw money. The ATMs lacked proper security models, which
would have helped reduce their exposure to malicious people (Kuhn et al., 2016). With more
research, there are better ways to detect and prevent Dyre malware, starting with educating
personnel on how these attacks occur. Banking officials should be skilled and equipped to
identify threats and stop them before any damage or loss occurs (Grosse et al., 2017).
Bank criminals have moved from physical robbery to cybercrime, which means they
have the knowledge and skills of how banking systems work (Sajjad et al., 2019). Therefore,
banking management should take serious action on security and safety guidelines and
policies. The banking industry needs to adopt new technology trends to offer its customers
the best services. Doing so comes with increased risks, especially for the banking industry,
because it deals with money. Banks have come up with mobile applications that are very
attractive to customers for their convenience (Sajjad et al., 2019).
When the bank systems are not secure, cybercriminals take advantage and steal money
and data easily (Ometov et al., 2019). Therefore, it is crucial always to update operating systems
since cybercriminals can quickly attack old operating systems. Smartphone users should avoid
using public Wi-Fi because they become vulnerable to hackers since public Wi-Fi may lack
proper security measures. All banks should always use two-factor authentication, which aids in
making sure legitimate users are doing the transactions (Ometov et al., 2019).
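As an illustration of how a second authentication factor can be checked, the sketch below computes a time-based one-time password (TOTP) in the manner of RFC 6238, using only the Python standard library. It is a simplified example: the function names are chosen for illustration, and a production system would also need secret provisioning, tolerance for clock drift, and rate limiting, none of which is shown here.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (HMAC-SHA1 variant)."""
    counter = unix_time // step  # number of time steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F   # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Check a submitted code against the code for the current time step."""
    return hmac.compare_digest(totp(secret, unix_time), submitted)
```

The bank and the customer's device share the secret in advance; a transaction proceeds only if the code the customer submits matches the code the bank computes for the same time step.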
Customers and other users should always disconnect or log out from applications they are
not using to reduce cybercriminals’ chances of attack (Wazid et al., 2019). Active notification
messages are also essential to ensure that the users are aware of all transactions taking place.
Banks should always store data, including passwords and other private
information, in secure areas (Wazid et al., 2019).
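One widely used way to keep stored passwords secure is to store only a salted, slow hash of each password rather than the password itself. The sketch below uses PBKDF2 from the Python standard library; the iteration count and function names are assumptions made for illustration and are not drawn from Wazid et al. (2019).

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune to the deployment's hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

With this approach, a breach of the stored table exposes salts and digests rather than the passwords themselves.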
The world economy now depends on technology because it enables people and
organizations to become more mobile and share data easily (Lee et al., 2019). Banking is one of
the industries that has transformed and created systems that work well, but hackers have also
improved their skills to match the advanced technology. Cybercriminals work day and night to
look for loopholes in the new technology. The hackers use sophisticated technology, which
makes it hard for banks and other institutions to defend themselves (Lee et al., 2019).
Hacking in some banks occurs without notice, and the banks only realize it when it is too
late. Organizations need to invest in new infrastructure because old and outdated systems are
more vulnerable to cyber-attacks (Seemma et al., 2018). The systems must be strong to make
sure problems are solved as soon as they occur. Organizations should also concentrate on
preventive measures such as firewalls, antivirus, and anti-malware applications.
Artificial intelligence is another way to prevent attacks because it helps make the
authentication process more reliable and secure, for example through fingerprint recognition.
Artificial intelligence and machine learning work well against phishing threats, which are very
common in banking. Artificial intelligence and machine learning are practical because they can
detect risks and end them before they cause severe damage (Khalaf et al., 2019).
Cybersecurity is a significant issue because organizations view cyber-attack as an IT
department issue only (Lee et al., 2019). For banks to prevent and neutralize attacks, there should
be the inclusion of all personnel. Training and education are crucial for all staff and
customers. When everyone is aware of the threats, it becomes easier to detect and prevent
cyber-attacks. Human beings are vulnerable, so spear-phishing attacks are more common. The
attackers have clever ways of making employees and customers give up sensitive information.
For example, in 2018, United Kingdom companies were hit by phishing attacks (Lee et al., 2019).
Cyber-attacks succeed in banks where there is negligence; attackers use phishing to
retrieve credentials and private data (Khalaf et al., 2019). Banks also face scams such as money
mule schemes, in which criminals recruit individuals to send and receive money
and pay them small amounts in return. Some customers receive messages
claiming that their accounts are blocked and that they need to provide credentials such as logins
(Khalaf et al., 2019).
With machine learning, customer experience is optimal. Retailers can use predictive
analytics to forecast future results from past customer service interactions (Auxier et al., 2019).
Machine learning can operate in real time to ensure that all entered details are handled
adequately, which reduces price stickiness. Sticky prices move readily in one direction but
hardly in the opposite direction. Price stickiness can be present in areas with long-term
contracts (Auxier et al., 2019).
Bank data storage should be secure, including passwords and other private information.
Prices are sticky when market prices fail to change quickly in response to external changes that
imply a price change should happen. When machine learning is used in customer service, it
offers more
convenience to customers and efficiency to other stakeholders (Khalaf et al., 2019). With
machine learning, customer experience is optimal.
Machine learning helps retailers develop a viable price strategy and reduce sticky prices
(Subroto & Apriyana, 2019). Machine learning algorithms can take the main pricing variables
into account and come up with an automatic pricing strategy. Machine learning also helps
retailers gather online data to show detailed pricing findings, reducing the stickiness of price
changes. It also helps change distribution methods and reduces adverse and hazardous effects
(Subroto & Apriyana, 2019).
The algorithm constantly goes through websites and collects prices, even from
competitors, for the same products to enable retailers to make informed decisions. Machine
learning aids in increasing revenue and reducing costs (Auxier et al., 2019). Through detailed
discussions, retailers may develop ways to improve checks such as absolute and relative checks,
and if they notice something is wrong, they can handle it accordingly. Having a good strategy
makes it possible to discover human errors before they become hard to change or manage
(Auxier et al., 2019).
The world economy now depends on technology because it enables people and
organizations to become more mobile and easily share data. Banking is one of the industries that
has transformed and created systems that work well, but hackers have also improved their skills
to match the advanced technology (Bélanger et al., 2017). Cybercriminals work day and night to
look for loopholes in the new technology. The hackers use sophisticated technology, making it
hard for banks and other institutions to defend themselves. Banks face attacks without notice,
and they only realize it when it is too late. Organizations need to invest in new infrastructure
because old and outdated systems are more vulnerable to cyber-attacks.
The systems must be strong to make sure problems are solved as soon as they occur.
Organizations should also concentrate on preventive measures such as firewalls, antivirus, and
anti-malware applications. Artificial intelligence is another way to prevent attacks because it
helps make the authentication process more robust and secure (for example, through
fingerprints) (Subroto & Apriyana, 2019). In addition, artificial intelligence and machine
learning work well to prevent attacks such as phishing threats, which are very common in
banking. Artificial intelligence and machine learning are practical because they can detect threats
and end them before they cause severe damage.
A good security policy ensures that the banks’ information technology team has the right
frameworks, making it easier to safeguard the banks’ systems. IT staff work well with proper
guidelines and policies and ensure that risks and exposure are minimal. When a good security
policy is in place, its enforcement and implementation protect customers’ data and bank
networks. With the right team, banks can continuously check the network and other banking
systems for any irregularities or unfamiliar changes and compare them with the laid-out policies
to ensure total compliance (Bélanger et al., 2017; Cui et al., 2017).
Security Frameworks for Data Privacy
Today, banks’ most significant challenge in managing data effectively and efficiently
is keeping up with technological advancement and innovation (Swinnen, 2018).
Adopting new technology poses benefits and risks for the banks and the customers (Lee et al.,
2019). Hackers benefit when new technologies are adopted before their full scope is
understood. Data privacy refers to who can access consumer information and for what purpose
(Swinnen, 2018). Hence, institutions must implement measures in their systems and policy
to govern the employees’ activities and handle customers’ personal information in achieving data
privacy.
The solution for safe banking systems begins by acknowledging that the industry
comprises many users, making it attractive for cybercriminals (Habibzadeh et al., 2019). The
first important step is to ensure users follow strict security measures and guidelines for safe
passwords. Banks should also have practical technical support to ensure the systems are secure
against cyber-attacks. They can use quantum-resistant cryptography and security features like Go
silent, which protect against cyber-attacks from various sources (Habibzadeh et al., 2019). Go
silent is a security feature that closes down access to banking networks by protecting servers,
mobile devices, printers, scanners, laptops, and desktops.
Banks should also review cloud security regularly to ensure it is operating
effectively. Employees should have access only when they need it, which reduces exposure to
security threats and vulnerabilities and protects customers. If a security issue occurs, the
organization should have a recovery plan to help minimize losses and downtime after an attack
(Habibzadeh et al., 2019).
Data privacy in banks can be addressed at several levels. A research study by Jain et al.
(2016) suggests a framework that applies security measures during data generation,
storage, and processing. Similarly, Liu (2015) proposed privacy protection with big data, which
entails implementing security measures and technologies during the data storage process.
Data generation
Data generation takes two forms. In one, the data owner provides information directly to a
third party. In the other, a person submits personal information in the course of activities such
as browsing the internet or using the Internet of Things (Jain et al., 2016). To protect the
information from violation, entities must build a system that can restrict access or falsify data.
Access restriction means the system does not allow users to obtain specific information if the
data owner has restricted it. In other instances, the data owner can provide partial access, and in
such cases, an entity can use encryption tools or anti-tracking extensions (Sangeetha & Sudha,
2019). Falsifying data applies to customers who mask their real information during online
shopping (Jain et al., 2016). Applicable tools include software such as MaskMe or sockpuppets.
Data storage
Liu (2015) highlights that privacy violations in an institution arise from the storage
process of information. Cloud technology is helpful to the banking world as it provides
flexibility and a fast way to provide services to clients efficiently. However, cloud computing is
subject to failure, and to ensure secure storage systems, institutions should combine
technologies, processes, and policies to improve security (Mazumdar et al., 2019). The storage of
data in organizations has been made easier through the innovation of big data technologies.
In storing big data, it is essential to consider the three main characteristics of big data:
volume, velocity, and variety. Therefore, a storage system should be scalable and dynamically
configurable (Siddiqa et al., 2017). A compromise of the company’s storage system is detrimental
because it exposes customers’ personal information; hence the software used should accept data
sets from multiple sources and withstand threats (Liu, 2015). There are various types of cloud
computing infrastructure. The IT department is responsible for selecting the best one based on
workload, operations, security, and cost (Joshi, 2016).
Cloud providers
Storage is crucial because all data collected requires storage, both before and after
processing and analysis. Big data storage is effective, but it needs proper planning and
preparation to succeed. Complex data can be stored well using cloud services and premium
hardware. What matters most is how the data is managed and stored (Drozdova et al., 2020).
It is possible to hold an unlimited volume of data regardless of the data models and
ranges. Big data work consists of data acquisition, analysis, curation, storage, and usage
(Solangi et al., 2018). Data storage in big data has the potential to change industries and the
world in general. However, security and privacy require further research as technology advances
(Solangi et al., 2018).
Cloud computing is an integral part of businesses and requires robust security strategies
to be secure against cyber-attacks. In cloud computing, data stored contains sensitive information
from multiple users. Data in the cloud is susceptible to leakage or theft (Singh & Singh, 2017).
Leakage and unauthorized access to data are detrimental and unacceptable. Hence, various
mechanisms and security standards exist to secure cloud computing systems.
The cloud is essential to organizations because it helps keep data safe and secure, but the
sheer amount of data stored on the cloud sometimes becomes an issue (Labes et al., 2016). It
makes the cloud an attractive target for attacks, especially for financial companies. According to
Labes et al. (2016), cloud providers are not responsible for attacks; hence, banks should be aware
of their own obligation to protect customers’ data and information saved on the cloud.
Drozdova et al. (2020), by contrast, hold cloud service providers responsible for data security.
Security controls apply during data creation, transfer, storage, use, destruction, and recovery
(Drozdova et al., 2020). Privacy in the cloud is achieved through data utilization, minimization,
user data limitation, and accessibility (Singh & Singh, 2017).
Cloud computing is an integral part of big data storage and is essential for banks. There is
a security challenge in the cloud environment, which affects the privacy and integrity of
information. Gwara et al. (2016) highlight that many vulnerabilities jeopardize data stored in the
cloud computing environment. The challenges are mainly in access control management and
cloud infrastructure.
Combining cloud computing with big data offers numerous advantages: services are delivered
over time-shared resources with widespread access. It also involves wide use of cloud computing
security technology that applies multi-tenancy principles and networking to resource sharing,
virtualized databases, and operating systems (Singh & Chatterjee, 2017).
The main issue affecting cloud computing security is storing data securely and sharing it
securely across cloud platforms at the network level (Drozdova et al., 2020).
This protocol ensures that intermodal communication and distribution are not built on a tent
cable level to provide easy access to information by administering user data. The information is
encrypted to reduce threats, and all data is labeled so that data protection can be enforced. At
least three backup services are used in case one online source becomes unavailable due to
technical problems or malware attacks (Drozdova et al., 2020).
Threats in the cloud are numerous, and mitigating them requires strict measures from
consumers and providers (Amara et al., 2017). Effective methods against threats consist of
performing regular backups to prevent data loss, strong authentication and encryption
techniques, and a robust application programming interface (API) (Amara et al., 2017). Strict
security policies and administrative processes hinder malicious insider threats and abusive use
of technology (Amara et al., 2017). Network traffic needs regular auditing to monitor activities
in the cloud for unauthorized access or malware.
Cloud infrastructure
Public cloud services are available for customers through a third party and are usually
available for government use. The models are cost-effective for institutions and easily accessible
through the internet. The public clouds extend the scale of information storage; however, they
fail to guarantee privacy in personal information (Joshi, 2016; Labes et al., 2016).
Private clouds are designed and tailored for an organization and linked through the
Intranet. The model offers privacy for information stored as it is easy to implement access
control and restrictions. Access is limited to the facility users when hosting private clouds in an
institution’s data center (Joshi, 2016; Labes et al., 2016).
Private clouds run on a private network that is managed either by a cloud provider, on or
off the entity’s premises, or by the organization itself in its internal data center (Amara et al.,
2017). Private cloud systems are secure because the institutions that own them control access
and the services. A private cloud is expensive and increases operational cost. However, it
provides data security and control (Amara et al., 2017).
A hybrid cloud is a cloud computing model designed to provide confidentiality and
privacy. According to Jain et al. (2016), a hybrid cloud integrates the private cloud with
public clouds. Public clouds execute only operations that are safe and non-sensitive. Hence,
a selective mode considers the type of data and its sensitivity. When an organization handles
classified or sensitive data, that data is processed in and extracted from the private cloud.
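The selective mode described above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the mechanism of Jain et al. (2016): the
sensitivity labels, the Job type, and the choose_cloud function are hypothetical names
introduced here.

```python
from dataclasses import dataclass

# Hypothetical sensitivity labels; a real institution would define its own scheme.
SENSITIVE_LABELS = {"sensitive", "classified"}


@dataclass
class Job:
    name: str
    data_label: str  # e.g. "public", "sensitive", "classified"


def choose_cloud(job: Job) -> str:
    """Send sensitive workloads to the private cloud, everything else to the public cloud."""
    return "private" if job.data_label in SENSITIVE_LABELS else "public"


jobs = [Job("marketing-report", "public"), Job("account-balances", "sensitive")]
placement = {job.name: choose_cloud(job) for job in jobs}
print(placement)
```

The routing decision depends only on the data label, so the quality of the labelling
process determines how much privacy the hybrid model actually delivers.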
There are many concerns regarding information on networks and clouds, including
data privacy, confidentiality, data remanence, data integrity, the transmission of data, and
malicious insiders (Drozdova et al., 2020). Privacy of data is a big concern for cloud computing
because of the safety issues that arise. Technologies like big data and cloud computing offer
attractive solutions, but customers are hesitant because they do not know where their data is
located, how it is transferred, or how it is processed.
Many organizations are not well informed on the security features offered by these
platforms. To assure consumers of privacy, internet and cloud providers must answer questions
such as which organizations share the services and how files are created and backed up.
Consumers and users also want to know what happens to deleted files and who can access their
data (Huttunen et al., 2019).
Confidentiality is essential in data privacy because it ensures data is only accessible to
authorized users. However, it is challenging to assure users’ confidentiality because of
virtualization and the multi-tenancy that most consumers share in a distributed network.
Service providers guarantee confidentiality, and the best way to ensure it is encryption and
decryption (Kumar et al., 2018). However, the service provider must explain where encryption
and decryption occur, what threats arise during data transfer, and how data lost by the service
provider is handled. In addition, service providers cannot guarantee that deleted data can later
be recovered from the cloud, and there is no standard for recycling the storage media.
Removing data remnants involves clearing, purging or sanitizing, or destruction (Kumar et al.,
2018). Other methods involve overwriting, degaussing, encryption, and media destruction
(Kumar et al., 2018).
Data integrity means protecting data from loss and from access by unauthorized users. Many
companies share their applications with other tenants, which allows users to share data with
unauthorized consumers and damages data integrity. Data integrity is crucial because users are
sharing sensitive information, which could cause severe damage if it falls into the wrong hands
(Kurnianto et al., 2018).
Privacy is vital during data transfer from consumers to the cloud. Encryption offers
protection during data transmission, but in most cases, data is transmitted without encryption
because encrypting and decrypting takes time (Lord, 2019). In addition, attackers can
intercept unprotected data and trace the communication (Lord, 2019). Authorized employees in
charge of managing and maintaining internet services sometimes take sensitive data; they are
known as malicious insiders. In some cases, it is challenging to hold these employees
accountable.
Cloud data encryption means transforming or encoding data before placing it in
distributed storage. In most cases, cloud service providers’ encryption services range from
encrypted connections to selective encryption of sensitive data, and they provide decryption
keys when required (Lord, 2019). For instance, Office 365 encryption is a built-in service that
encrypts all messages inside and outside the platform. In addition, encryption services
guarantee no access to data without a decryption key (Lord, 2019).
Several practices protect data through encryption, such as ensuring data is encrypted before
it is uploaded. Some cloud services automatically encrypt data before upload; where that is not
possible, it is advisable to encrypt the files before uploading them. Data encryption secures
digital data on computer systems and data shared through the internet or other computer
networks (Spicer, 2017). Modern encryption algorithms are better than the traditional methods
since they have enhanced the security of IT systems and communications. They offer privacy
and support key security functions such as authentication and integrity. Authentication confirms
the origin of a message, and integrity provides evidence that a message is unchanged.
Non-repudiation ensures that a sender cannot deny having sent a message (Spicer, 2017).
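As a minimal sketch of the authentication and integrity properties described above, the
following Python example uses a keyed hash (HMAC-SHA256) from the standard library. The key,
the message, and the function names make_tag and verify are illustrative assumptions.
Because both parties share the same key, an HMAC alone does not give non-repudiation; that
property would require a digital signature.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that binds the message to the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = secrets.token_bytes(32)            # shared secret, assumed to be exchanged safely
message = b"transfer 100 to account 42"  # illustrative message
tag = make_tag(key, message)

print(verify(key, message, tag))                        # unchanged message
print(verify(key, b"transfer 900 to account 42", tag))  # altered message
```

The receiver accepts a message only when the recomputed tag matches, so any alteration in
transit is detected.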
Solutions offered by data encryption include encryption of devices, emails, and data.
Most encryption products also include control capabilities. Many organizations have a problem
protecting their data and preventing data loss, especially since their staff sometimes use outside
devices, removable media, and web applications (Spicer, 2017). Organizations cannot control
and guard sensitive information when employees compromise the processes by copying data to
removable devices and uploading it to the cloud (Li et al., 2019).
The most efficient data loss prevention solutions keep data safe and block malware from
removable and outside devices and web applications. Protection measures also involve proper
use of devices and applications, and data security through auto-encryption even when the data
is outside the organization. Email control and encryption are essential to avoid data loss.
Encrypting emails is the best way of providing protection and complying with set regulations.
It is ideal for remote working, BYOD, and outsourced projects. Employees can use email with
effective data loss prevention measures because sensitive data is encrypted
(Singh & Singh, 2017).
Data encryption is essential in every organization and company. There are two main
kinds of encryption: symmetric encryption and asymmetric encryption. Asymmetric encryption
is also known as public-key encryption (Spicer, 2017). Symmetric encryption involves one key,
and everyone involved uses the same key to encrypt and decrypt. In asymmetric encryption, two
keys are involved: one key is for encryption, and the other is for decryption. The
decryption key is kept private while the encryption key is open to the public. Asymmetric
encryption is a foundational technology for SSL (Singh & Singh, 2017).
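The difference between the two kinds of encryption can be illustrated with a toy example.
The sketch below is for illustration only and is not secure: the symmetric part uses a
one-time XOR key as a stand-in for a real cipher such as AES, and the asymmetric part uses
textbook RSA with deliberately tiny primes. All names and values are assumptions introduced
here.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same key both encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, key))


# Symmetric: sender and receiver share one secret key.
message = b"PIN 4321"
shared_key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, shared_key)
recovered = xor_bytes(ciphertext, shared_key)

# Asymmetric (textbook RSA with tiny primes, insecure by design).
p, q = 61, 53
n = p * q   # 3233, part of both keys
e = 17      # public exponent: anyone may encrypt with (n, e)
d = 2753    # private exponent: 17 * 2753 = 1 mod (p - 1) * (q - 1)

m = 65                   # a small number standing in for a message
c = pow(m, e, n)         # encrypt with the public key
m_back = pow(c, d, n)    # decrypt with the private key

print(recovered == message, m_back == m)
```

In the symmetric case, anyone holding shared_key can read the message; in the asymmetric
case, publishing (n, e) does not reveal d when realistic key sizes are used.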
An encryption algorithm transforms data into ciphertext. Algorithms use encryption keys so
that, although the ciphertext appears random, it can be turned back into plain text with a
decryption key. Standard symmetric encryption algorithms include AES, 3DES, and SNOW;
asymmetric algorithms include RSA and elliptic curve cryptography (Li et al., 2019).
A brute force attack on encryption happens when unauthorized people try to break in without
the decryption key by making many guesses to figure out the key (Lord, 2019). Such attacks are
common with modern computers, which is why encryption must be strong enough to be hard to
crack. The use of contemporary encryption keys and complex passwords helps prevent brute
force attacks. Weak passwords are dangerous because they are easy targets for brute force
attacks. Web applications are kept safe with an encryption protocol known as transport layer
security (TLS). Transport layer security has replaced the secure sockets layer (SSL). All
websites served over HTTPS must have a TLS certificate installed on the origin server
(Lord, 2019).
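A short calculation shows why weak passwords fall to brute force while contemporary keys do
not. The guessing rate below is an assumed figure chosen only for illustration, and the
function names are hypothetical.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidates an attacker must try in the worst case."""
    return alphabet_size ** length


def worst_case_seconds(candidates: int, guesses_per_second: int) -> float:
    """Time to exhaust the whole keyspace at a fixed guessing rate."""
    return candidates / guesses_per_second


RATE = 10**9  # assumed: one billion guesses per second

weak = keyspace(10, 4)     # four-digit PIN
strong = keyspace(94, 12)  # twelve printable ASCII characters
aes128 = 2**128            # every possible 128-bit key

for label, size in [("4-digit PIN", weak), ("12-char password", strong), ("128-bit key", aes128)]:
    years = worst_case_seconds(size, RATE) / (365 * 24 * 3600)
    print(f"{label}: {size:.3e} candidates, about {years:.3e} years")
```

Each extra character or key bit multiplies the attacker's work, which is the reason longer
passwords and modern key lengths are recommended.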
Access control is essential because it ensures that only authorized people handle the data
to avoid unauthorized access by users from outside and inside the organization. An effective
encryption strategy contains reliable access control techniques like passwords, two-factor
authentication, and permissions. In addition, access control requires constant auditing and checks
to maintain validity (Islam & Riyas, 2017).
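The combination of permissions, a second factor, and auditing can be sketched as follows.
The roles, permission names, and the is_allowed function are hypothetical and serve only to
illustrate the idea; they are not taken from Islam and Riyas (2017).

```python
# Hypothetical role-to-permission mapping for a bank system.
ROLE_PERMISSIONS = {
    "teller": {"read_account"},
    "auditor": {"read_account", "read_audit_log"},
    "admin": {"read_account", "update_account", "read_audit_log"},
}

audit_log = []  # every decision is recorded so that access can be reviewed later


def is_allowed(role: str, permission: str, second_factor_ok: bool) -> bool:
    """Grant access only if the second factor passed and the role holds the permission."""
    allowed = second_factor_ok and permission in ROLE_PERMISSIONS.get(role, set())
    audit_log.append((role, permission, allowed))
    return allowed


print(is_allowed("admin", "update_account", True))
print(is_allowed("teller", "update_account", True))
print(is_allowed("admin", "update_account", False))
```

Unknown roles receive an empty permission set, so access is denied by default.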
Security and data privacy require the right policies to govern all actions taken. A written
policy should be developed under the supervision of management, business partners, third
parties, and other stakeholders (Islam & Riyas, 2017). Compliance is essential, and encryption
is necessary to avoid non-compliance. Data encryption is essential to ensure:
Privacy: encryption ensures all communications and data are safe and reach only the end users
the data owner intends. It also helps prevent advertising networks, internet service providers,
and the government from accessing and acquiring sensitive data (Anant et al., 2020).
Security: encryption minimizes data breaches when data is in transit or in storage. If an
organization loses its devices and the hard drives are encrypted, the data will remain protected.
In addition, encrypted communication allows users to communicate without fear of information
and data leaks to attackers (Broeders et al., 2017).
Data integrity: encryption stops attacks such as on-path attacks. Transmitting data over the
network with encryption ensures that the information is delivered without tampering or
alteration (Anant et al., 2020).
Authentication: web users can confirm that a website’s owner holds the private key that
corresponds to the public key listed in the website’s TLS certificate (Ometov et al., 2019).
Regulations: all organizations holding customer data must follow the applicable regulations
and standards, such as HIPAA, PCI-DSS, and GDPR. Organizations that do not follow these
regulations face penalties and may be unable to handle users’ sensitive data and information
(Islam & Riyas, 2017).
Elzamly et al. (2019) propose a framework that entails utilizing the various types of
clouds, including private, public, or hybrid, and integrating security policies and models in the
system. The framework for mitigating privacy issues in cloud computing environments entails
the following. Achieving data privacy involves encryption, administrative confidentiality,
and flexible access for data computing services. Cryptography enhances network security; hence
it is the best practice for cloud services. Institutions must also adopt hypervisor security for
monitoring cloud activities.
Jain et al. (2016) propose SecCloud as a data storage model that preserves security and
data privacy. SecCloud is a form of storage virtualization whereby multiple storage devices are
combined but create an impression of one single storage system.
Access control management
Unsafe access to data in the cloud poses security threats. A person can access
unauthorized information or leave an opening for hackers to access and launch an attack. It is
essential to limit access to the cloud to safeguard information. Protection of data in the cloud
increases the system’s integrity and ease of access (Gwara et al., 2016).
Encryption
The most common type of encryption for safeguarding emails is public-key encryption, also
known as asymmetric encryption. Public key infrastructure (PKI) handles key distribution and
validation. It comprises a certificate authority that confirms ownership of the public key, a
registration authority that approves requests before the certificate authority issues a digital
certificate, and a certificate management system. Quality and reliable security requires time
and patience because it is a complex process (Lord, 2019). Organizations are required to
consider the data sets and encryption techniques carefully. Several elements contribute to
successful, high-quality encryption.
Attribute-based encryption. The basis of access control is that users with proper
authorization can gain complete access to resources. Attribute-based encryption is a promising
security model to restrict and control access in the cloud computing environment. It offers
secure decryption outsourcing, key leakage resistance, auditability of decryption, and limited
and anonymous access control (Ning et al., 2018). Moreover, it applies in cloud computing
environments due to continuous leakage resilience (Li et al., 2019).
The preferred approach is an attribute-based encryption scheme in which the ciphertext is
matched against the user’s description, which is a successful way of securing information. The
proxy re-encryption-based approach provides ways of converting ciphertexts, using KeyGen to
provide the security and public parameters. When deployed in cloud computing, data is
encrypted with a data encryption key (DEK), and the DEK is encrypted using a public key
(Khalil et al., 2018).
Authorization and authentication. The individuals safeguarding customers’
information and protocols are bank employees and security officials (Goswami & Madan, 2017).
Personal information is required to process various transactions and identify the client and their
account. There should be the highest level of privilege in the authentication and management of
accounts.
The allocation of roles and privileges should not undermine the segregation of duties. In
addition, a process should be in place to de-provision credentials. Cloud systems should have
protocols that grant system privileges to only a few people. Inter-application connections to
cloud services should have authorization and authentication protocols (Sangeetha & Sudha, 2019).
Better authorization and authentication techniques prevent attacks in cloud computing.
An authentication technique such as biometrics is a secure form of single sign-on authentication
(Kumar et al., 2018). Attacks target various security levels in the cloud (Amara et al., 2017). For
instance, denial-of-service attacks affect the application level. Attackers make services
unavailable by disabling them or breaking the network: they send data packets that saturate the
network and consume the server’s resources (Soomro et al., 2016).
Homomorphic encryption. It enables computing in the cloud with encrypted data
without conversion into plain text. Homomorphic encryption can be deployed in any setting
where updating the ciphertext receiver is possible. It allows users to perform multiple operations
with encrypted data, maintaining data security (Potey et al., 2016).
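The idea of computing on encrypted data can be illustrated with textbook RSA, which is
multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product
of the plaintexts. This is a toy sketch with tiny, insecure parameters chosen for
illustration. It supports multiplication only, so it is partially homomorphic; a fully
homomorphic scheme would support both addition and multiplication.

```python
# Textbook RSA with tiny primes, for illustration only (not secure).
p, q = 61, 53
n = p * q   # 3233
e = 17      # public exponent
d = 2753    # private exponent: 17 * 2753 = 1 mod (p - 1) * (q - 1)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


a, b = 7, 3

# The cloud multiplies the ciphertexts without ever seeing a or b.
c_product = (encrypt(a) * encrypt(b)) % n

# Only the key holder can decrypt, and the result equals a * b.
print(decrypt(c_product))  # 21
```

The server performing the multiplication never handles the plaintext values, which is the
property that makes homomorphic encryption attractive for cloud processing.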
A homomorphism preserves an algebraic structure such as a group, ring, or field (Galetto,
2015). Fully homomorphic encryption comprises four algorithms: KeyGen, which takes the
security parameter and produces the keys and public parameters; encryption; evaluation; and
decryption, which takes a ciphertext and the private key and outputs the plain text, ensuring
maximum safety for user information. It serves customer DNA service and enhances
customer-centric services (Galetto, 2015).
Applicable PSI is used to detect any suspicious cloud information that threatens user
information. Completing PSI theory consisting of decision tree tool solving classification
program machine learning and data-mining provides possible values attributed to different
models with values and transactions between lives and the expected class value for the new
transaction. As a result, the protocol reduces the private finding, the attribute of nodes, and the
highest information gain (Galetto, 2015).
A similar bucketization technique, used with a secured proxy, preserves confidentiality for
queries over encrypted data sets managed in cloud computing. Lastly, TPM-based secure HDFS is
a widely used hardware-based security protection technique that provides maximum protection
for sensitive operations, environments, assets, and storage. TrustZone-based solutions, built on
technology developed by ARM, are deployed to protect confidentiality in cloud computing
(Khalil et al., 2018).
A software-based data distribution control encryption reacts before sending to the users
without limitation. However, it provides an extra description of the key embedded software. It
shares it with the whole software package that deals with this white box implemented on a
scripted robbery that plays a crucial role in ensuring the best information is protected. The
authorized personnel receives the description (Khalil et al., 2018).
Storage path encryption. It secures the storage of big data on the cloud (Jain et al.,
2016).
Two-factor authentication. Kuepper (2020) highlights that two-factor authentication is a
robust measure for securing customers’ information and bank account details. For any login or
transaction via web applications or online banking, a code is sent to the registered phone
number for authorization. Hackers find it challenging to access information on the account
because they would need both the cell phone and the computer. In addition, banks should
restrict employees from sharing account credentials among themselves.
Two-factor authentication schemes offer an extra layer of security and protect accounts
against compromise. Password leaks are one of the ways attackers compromise accounts. Many
users choose weak passwords or reuse the same passwords and usernames across accounts.
Two-factor authentication offers a robust defense: even when a hacker steals a user’s
password, they still need a single-use passcode to gain access, which requires stealing the
user’s phone. The scheme therefore restricts remote hackers from gaining access to accounts
(Reese et al., 2019).
Typically, users need to provide two factors before gaining access to their accounts.
Factors include a password, a cell phone token, or a biometric. Many organizations have adopted
these methods and require users to enter, in addition to their passwords, a single-use passcode
sent via SMS to their phones, produced by a hardware code generator, or computed as a
time-based one-time password (Reese et al., 2019).
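The time-based one-time password mentioned above can be sketched with the Python standard
library, following the published HOTP (RFC 4226) and TOTP (RFC 6238) algorithms. The shared
secret and the verify_login helper are assumptions for illustration; a production system
would also need rate limiting, secure secret storage, and tolerance for clock drift.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password with dynamic truncation (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_login(password_ok: bool, secret: bytes, submitted_code: str, at=None) -> bool:
    """Both factors must pass: the password check and the current one-time code."""
    return password_ok and hmac.compare_digest(totp(secret, at), submitted_code)


DEMO_SECRET = b"12345678901234567890"  # test secret used in the RFC examples
print(totp(DEMO_SECRET, at=59))        # 287082, the RFC 4226 test value for counter 1
```

A stolen password alone fails verify_login, because the attacker cannot produce the code
without the secret held on the user's device.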
Data processing
Privacy violations in data processing occur from the failure of technologies applied by an
institution. Utilizing multiple virtual technologies or using too many encryption methods in the
system may generate risks in itself. It arises due to access control management of the information
and providing a dynamic cloud service (Ammirato et al., 2018). Data processing is critical in the
life cycle of big data, and privacy-protecting at this level comes about in two main phases
(Eyupoglu et al., 2018).
In the first phase, sensitive information belonging to data owners may be present in the
data being processed, so the data sets must be modified to safeguard personal information from
unsolicited disclosure and violation. However, the techniques applied in data modification
should not destroy the data’s usefulness for its original purpose; otherwise, the analysis of
such data will become useless (Raju & Aparna, 2018).
Soria-Comas and Domingo-Ferrer (2015) propose data anonymization as a solution to
big data privacy issues. Anonymization techniques de-identify personal information so that it
cannot be linked back to the owner. Anonymization alone cannot protect against all
forms of threats; however, it ensures that data can be analyzed and utilized to extract insights
that may be valuable to institutions.
Privacy protection in data processing utilizes anonymization techniques and the
privacy-utility trade-off. The level of anonymization must be balanced against utility, so that
the data that remains can still be used for its intended purpose. Information loss reduces data
utility, which creates challenges because the insights garnered are usually invalid
(Goswami & Madan, 2017).
Yu (2016) argues data anonymization has limitations and does not provide sufficient
privacy. Anonymization privacy models include k-anonymity, l-diversity, t-closeness, and
differential privacy. These models are suitable for satisfying composability, favorable
computation cost, and linkability (Soria-Comas & Domingo-Ferrer, 2015).
The second phase in data processing entails extracting the necessary information without
violating privacy. Data mining techniques have been developed to extract trends and patterns
from data while preserving privacy (Hassani et al., 2018; Mehmood et al., 2016). Recently,
privacy-preserving data mining techniques have been developed to protect information against
threats and unauthorized access.
Privacy-preserving data mining techniques are methods that enable the extraction and
analysis of data while preserving a certain level of privacy. Big data has numerous potential
benefits for institutions, and to gain insights, organizations need to perform data mining. The
available methods for preserving privacy during data extraction, as highlighted by Jain et al.
(2016) and Mehmood et al. (2016), include the following:
Privacy-preserving classification. It is a technique that involves identifying the class to
which a new input belongs and grouping it accordingly in the data sets. For classifying data,
the methods proposed are the Bayesian formula and random reconstruction techniques (Jain et
al., 2016). The methods achieve privacy; however, they are difficult to apply to large,
centralized, diverse data sets.
Privacy-preserving clustering. It is a common technique utilized in many sectors to
analyze unstructured and unfamiliar data. Clustering algorithms can manage to preserve privacy
in centralized small data sets using low-order statistics. This framework, however, is not
applicable in data sets that are complex and decentralized (Mehmood et al., 2016).
Privacy-preserving association rule mining. The techniques find critical relationships,
trends, or patterns in input data. In privacy protection, this technique prevents the extraction of
sensitive information. Achievement of association rule mining is through the distortion of
original data; thereby, gaining insights into the data is possible without requiring the original
data’s values (Mehmood et al., 2016).
Goswami and Madan (2017) state that these techniques are novel methods built on
traditional ones to preserve data. They enable institutions to use data while providing privacy
in big data as mandated by law. Traditional methods have limitations, and with big data
collection increasing, institutions should adopt the new techniques. Many scholars have found
that these methods provide better privacy for information in use (Zhang, 2018). The purpose of
privacy-preserving data mining techniques is to prevent data misuse during data mining while
preserving the value of the information. Methods include perturbation, anonymization, and
cryptography.
Perturbation
Perturbation involves modifying data by altering data values using mathematical
procedures. The process rearranges the data matrix, subtracts values, or adds unfamiliar values
(Taric & Poovammal, 2017). Distorting the data protects sensitive values so that they cannot be
tied to specific individuals. It provides anonymization to data sets and is an efficient process.
The use of perturbation requires careful modification to preserve the usefulness of the data for
gaining insights.
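One simple form of perturbation is additive noise. The sketch below is an illustrative
assumption rather than the procedure of Taric and Poovammal (2017): it adds random Gaussian
noise to each value and then centres the noise so that the column total, and therefore the
mean, is preserved for analysis. The balances are made-up sample values.

```python
import random


def perturb(values, scale, seed=None):
    """Add zero-sum random noise so individual values change but the total does not."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, scale) for _ in values]
    mean_noise = sum(noise) / len(noise)
    # Centring the noise keeps the aggregate useful for analysis.
    return [v + (z - mean_noise) for v, z in zip(values, noise)]


balances = [1200.0, 560.0, 9800.0, 430.0, 2750.0]  # made-up sample data
masked = perturb(balances, scale=100.0, seed=7)

print([round(x, 2) for x in masked])
print(round(sum(balances), 2), round(sum(masked), 2))
```

A larger scale hides individual values better but distorts per-record analysis more, which
is the trade-off the paragraph above describes.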
Cryptography
Cryptographic tools provide confidentiality to protect general big data applications. With
homomorphic encryption for general-purpose data processing, the user encrypts information
before sending it to the application for processing. This is a preventative measure for cloud
processing: homomorphic encryption techniques achieve confidentiality protection and data
processing simultaneously (Galetto, 2015).
Cryptography is gaining attention as a technique for privacy protection in big data (Yu,
2016). Cryptography secures sensitive data attributes, offering security and preserving privacy
(Taric & Poovammal, 2017). Cryptography applies protocols that allow data computations, and
parties only get output and cannot learn about inputs used. Cryptography applies authorized
encryption and predicate encryption techniques to protect privacy (Singh & Singh, 2017).
Data anonymization
Data anonymization is a process that generalizes and suppresses data (Mehmood et al.,
2016). The process is also known as de-identification and involves the replacement of
identifying attributes or names. It provides the necessary protection to personal information,
enabling its use in big data analytics. Data anonymization enables institutions to utilize data
and share it with other entities to research or innovate new services without compromising
consumers’ privacy (Sei et al., 2019).
In a database, attributes are generally categorized as follows:
Explicit identifiers. These are attributes unique to an individual that can distinguish one
person's information and account from another's. They include names, identity card numbers,
social security numbers, and driver’s license numbers (Jain et al., 2016; Sei et al., 2019;
Zhang, 2018).
Quasi-identifiers. These are attributes such as age, gender, and home address that can be
combined with other information to identify a person (Jain et al., 2016; Sei et al., 2019;
Zhang, 2018).
Sensitive identifiers. These are attributes that data owners consider sensitive and prefer
not to disclose, or that they allow only a limited number of individuals to access. By law, this
information is not available to the public or accessible in public directories. It includes the
amount of money in a person’s account and their credit score. Exposure of this information can
cause harm and increase security concerns (Jain et al., 2016; Sei et al., 2019; Zhang, 2018).
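The three attribute categories can be recorded alongside a table's schema so that later anonymization steps know how to treat each column. The sketch below is illustrative only; the column names and the `SCHEMA` mapping are hypothetical.

```python
EXPLICIT = "explicit"
QUASI = "quasi"
SENSITIVE = "sensitive"

# Hypothetical schema for a bank customer table.
SCHEMA = {
    "name": EXPLICIT,
    "ssn": EXPLICIT,
    "age": QUASI,
    "gender": QUASI,
    "home_address": QUASI,
    "account_balance": SENSITIVE,
    "credit_score": SENSITIVE,
}


def columns_of(kind, schema=SCHEMA):
    """List the columns that fall into one category."""
    return [name for name, category in schema.items() if category == kind]


def drop_explicit_identifiers(record, schema=SCHEMA):
    """Remove explicit identifiers; the remaining columns still need further treatment."""
    return {k: v for k, v in record.items() if schema.get(k) != EXPLICIT}


customer = {"name": "Jane Doe", "ssn": "000-00-0000", "age": 34, "credit_score": 710}
print(drop_explicit_identifiers(customer))  # {'age': 34, 'credit_score': 710}
```

Dropping explicit identifiers alone is not sufficient, because quasi-identifiers can still be combined to re-identify a person; the models discussed next address that risk.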
In de-identification, various models are available to ensure that the privacy of data is
maintained. Many studies highlight the following methods to preserve data privacy
(Eyupoglu et al., 2018; Goswami & Madan, 2017; Jain et al., 2016; Sei et al., 2019; Tu et al.,
2019; Zhang, 2018).
K-anonymity
K-anonymity is achieved in datasets through suppression and generalization, so that each
record in the released data set is indistinguishable from at least k-1 other records on its
identifying attributes (Goswami & Madan, 2017). For instance, with suppression, the values in a
table's religion column are replaced with an asterisk (*). With generalization, age is a
commonly generalized attribute, for example replaced with a range such as 25 years of age or
below. The drawback of this model is that it cannot protect data against a background knowledge
attack or temporal attack (Eyupoglu et al., 2018). Also, it is challenging to render multiple
private records anonymous while minimizing the information released (Wang et al., 2018).
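The suppression and generalization steps, and the resulting k-anonymity condition, can be sketched as follows. The records, the two-bucket age range, and the function names are assumptions made for this example.

```python
from collections import Counter


def generalize_age(age):
    """Generalize an exact age into a coarse range."""
    return "<=25" if age <= 25 else ">25"


def anonymize(records):
    """Suppress religion and generalize age; the balance is left unchanged."""
    return [
        {"age": generalize_age(r["age"]), "religion": "*", "balance": r["balance"]}
        for r in records
    ]


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


records = [
    {"age": 22, "religion": "A", "balance": 900},
    {"age": 25, "religion": "B", "balance": 150},
    {"age": 31, "religion": "A", "balance": 4200},
    {"age": 47, "religion": "C", "balance": 760},
]
released = anonymize(records)
print(is_k_anonymous(records, ["age", "religion"], k=2))   # False
print(is_k_anonymous(released, ["age", "religion"], k=2))  # True
```

In the raw table every record is unique on age and religion, while in the released table each record shares its generalized values with one other record, so k = 2 holds.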
L-Diversity
It is a model that reduces granularity in data sets, hence providing privacy. It is an
extension of k-anonymity and overcomes some of the challenges and attacks that model cannot
handle (Eyupoglu et al., 2018). As in k-anonymity, any given record maps onto at least k
records in the dataset. In addition, the model depends on the range of sensitive values: each
group of matching records must contain at least l distinct values of the sensitive attribute
(Sei et al., 2019). The drawbacks of this method are the similarity attack and the skewness
attack. In a similarity attack, the sensitive values in a group appear different in
presentation but are similar in meaning.
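A minimal check for l-diversity, assuming the common "distinct values" formulation in which every group of records sharing the same quasi-identifier values must hold at least l different sensitive values. The table and function name are illustrative.

```python
from collections import defaultdict


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class holds at least l distinct sensitive values."""
    classes = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        classes[key].add(r[sensitive])
    return all(len(values) >= l for values in classes.values())


released = [
    {"age": "<=25", "score": "poor"},
    {"age": "<=25", "score": "poor"},
    {"age": ">25", "score": "good"},
    {"age": ">25", "score": "fair"},
]
print(is_l_diverse(released, ["age"], "score", l=2))  # False
```

The table is 2-anonymous on age, yet everyone in the "<=25" group has the same score, so knowing a person's age range reveals the sensitive value. L-diversity is designed to rule out this case.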
T-Closeness
This model builds on k-anonymity and l-diversity and further reduces the granularity in
data representation. Furthermore, it prevents attribute disclosure, benefiting the institution
while representing data to ensure privacy (Tu et al., 2019). An equivalence class of records has
t-closeness when the distance between the distribution of the sensitive attribute in the class
and its distribution in the whole table is less than or equal to a threshold t (Wang et al.,
2018).
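The t-closeness condition can be sketched as a comparison of distributions. This example assumes a categorical sensitive attribute and uses total variation distance as the distance measure; other measures, such as Earth Mover's Distance, are also used for t-closeness, and all names below are hypothetical.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Relative frequency of each value."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def total_variation(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between a class's sensitive distribution and the table's."""
    overall = distribution([r[sensitive] for r in records])
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r[sensitive])
    return max(total_variation(distribution(v), overall) for v in classes.values())


def has_t_closeness(records, quasi_identifiers, sensitive, t):
    """True if every equivalence class is within distance t of the whole table."""
    return max_class_distance(records, quasi_identifiers, sensitive) <= t


skewed = [
    {"age": "<=25", "score": "poor"},
    {"age": "<=25", "score": "poor"},
    {"age": ">25", "score": "good"},
    {"age": ">25", "score": "good"},
]
print(max_class_distance(skewed, ["age"], "score"))        # 0.5
print(has_t_closeness(skewed, ["age"], "score", t=0.2))    # False
```

Here each age group holds only one score while the table as a whole is split evenly, so each class is at distance 0.5 from the table and fails a threshold of 0.2.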
Differential privacy
Differential privacy is a recently established technology whereby a database analyst can
acquire information without compromising personal information. The technology works by
introducing a minimum distortion: it adds noise to query results that is large enough to ensure
individuals’ identities remain hidden. At the same time, the analysts can still gather adequate
information for analysis. The model does not grant direct access to personal information;
instead, a privacy guard is present in the database that separates the information (Wang et
al., 2018). Through the privacy guard, an analyst can only utilize authorized data that is not
sensitive. Therefore, to carry out activities in the database, an analyst goes through the
following steps (Wang et al., 2018):
Step 1: The analyst makes a query to the database through the privacy guard.
Step 2: The privacy guard evaluates the query for privacy risk.
Step 3: After the evaluation, the privacy guard acquires the answer from the database.
Step 4: The privacy guard returns the answer to the analyst, adding distortion depending on the
privacy risk assessed and found. The magnitude of distortion is proportional to the risk. If
the query affects privacy considerably, then the distortion will significantly affect the
quality of answers.
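The four steps can be sketched with the Laplace mechanism, a standard way to realize differential privacy for counting queries. The `PrivacyGuard` class, its parameter `epsilon`, and the sample accounts are assumptions made for this illustration; the risk evaluation in Step 2 is reduced here to the query's sensitivity, that is, how much one person can change the answer.

```python
import random


class PrivacyGuard:
    """Hypothetical guard that answers counting queries with Laplace noise."""

    def __init__(self, records, epsilon, seed=None):
        self._records = records
        self._epsilon = epsilon  # smaller epsilon means stronger privacy and more noise
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two exponential samples follows a Laplace distribution.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def count(self, predicate):
        """Receive a query, compute the true answer, and return it with distortion."""
        true_answer = sum(1 for r in self._records if predicate(r))
        sensitivity = 1.0  # one person changes a count by at most 1
        return true_answer + self._laplace(sensitivity / self._epsilon)


accounts = [{"balance": b} for b in (120, 5400, 89, 13000, 760, 2300)]
guard = PrivacyGuard(accounts, epsilon=0.5, seed=1)

# The true count is 3; the analyst only ever sees the noisy value.
print(guard.count(lambda r: r["balance"] > 1000))
```

The noise scale is the sensitivity divided by `epsilon`, so queries that expose more about an individual, or a stricter privacy setting, receive proportionally more distortion.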
Privacy Legal Mechanism
The collection of personal data from consumers comes with greater responsibilities for
institutions. Currently, there are several laws governing consumer privacy during data collection
and utilization. The regulations mandate financial institutions to protect the privacy of the
financial data collected, which also extends to its use and sharing with other parties (Swinnen,
2018). Privacy in big data is of legal significance to lawmakers due to its magnitude.
Many challenges in big data stem from technical issues and privacy-preserving
techniques (Goswami & Madan, 2017). In addition, privacy issues arise from consumer privacy,
and the resulting legal repercussions compel all entities to maintain a privacy policy. Most
mobile applications have a privacy policy agreement that users accept before gaining full access
to the app. The method is known as a clickwrap agreement; it is enforceable, and all users
must accept the terms of the contract. The best business practice entails having a privacy policy.
Confidential information entails the information belonging to third parties and not the
entities. Hence, financial institutions hold a lot of sensitive information and must comply with
laws and regulations to ensure they do not breach the data owner’s confidentiality or rules
(Hasan et al., 2020). Furthermore, privacy regulations are evolving, and with increased data
collection and technology advancements, consumers have more control of their data.
The General Data Protection Regulation. The General Data Protection Regulation
(GDPR), first published in 2016 and implemented in May 2018, gives consumers more control
over how their personal data is used and secured by institutions in the European Economic Area.
Consumers have the right to access information on how companies handle their data, as well as
the rights to rectification, erasure, restriction of processing, and portability of their
personal data (OECD, 2020).
The GDPR regulates organizations in data collection, storage, and sharing. It promotes
personal data privacy and controls the electronic transmission, extraction, and analysis of
data. For handling personal data, GDPR requires organizations to achieve the following:
“lawfulness, fairness and transparency; purpose limitation; data minimization; accuracy; storage
limitation; integrity and confidentiality (security); and accountability.” Organizations that
fail to comply with these rules are fined heavily (OECD, 2020).
The Gramm-Leach-Bliley Act. Information that many perceive as private is regularly
sold and bought by financial institutions. This includes information such as account numbers
and balances. The goal of the Act is to govern and control the sale of personal information.
Earlier, when different financial institutions merged, the consolidated institution would gain
access to various aspects of a customer’s personal life (Gonzalez, 2015).
The Act's data privacy provisions require, first, that companies secure financial
information properly. Secondly, consumers should be given a detailed account of, and be educated
about, the existing policy on sharing personal information. Lastly, consumers have the right to
opt out of the sharing of certain sensitive, personal information they provide (Gonzalez, 2015).
The regulation was formulated to improve data protection in the financial sector. The
GLBA requires companies providing financial services to consumers, such as insurance, loans,
investment, or financial advice, to explain how they will use consumers' personal data. The
regulation also requires financial companies to disclose how they safeguard customers' sensitive
data (FTC, 2021).
According to the GLBA regulation, the definition of a financial institution is complex. It
includes the businesses that might not be considered financial institutions themselves. The
regulation applies to all companies extensively involved in providing financial products and
services regardless of their size. Examples of such businesses are payday lenders, nonbank
lenders, check-cashing providers, property appraisers, mortgage brokers, courier services, and
professional tax preparers (FTC, 2021). The regulation also applies to ATM operators and credit
reporting organizations because they collect customers’ data.
In addition to adhering to the GLBA regulations themselves, financial companies need to
ensure that their service providers and affiliates conform to the guidelines. According to the
Safeguards Rule, financial companies should create an information security plan that describes
the program they will use to protect consumer data. This plan has to correspond to the company’s
size, complexity, scope and nature of activities, and the sensitivity of protected data
(Gonzalez, 2015).
Therefore, several requirements are necessary for the company’s plan (FTC, 2021).
The first requirement is the designation of one or more employees to coordinate the
information security program. The second requirement is identifying and assessing the risks that
the customer data might face, including an evaluation of the current safeguards. The third
requirement is the design and implementation of the safeguard program, together with regular
monitoring and testing. The fourth requirement is the selection of service providers
that can maintain security in the company. The last requirement is evaluating and adjusting the
program based on relevant situations that might affect the company’s operations (FTC, 2021).
These five requirements are flexible by design, which helps companies implement
protection methods that are appropriate for their privacy needs. For instance, some companies
might put their security plan in a single document, while others require several documents.
Some might need only one employee to coordinate the security of customer data, while others
might delegate the duty to several employees who should work as a team (FTC, 2021).
California Consumer Privacy Act. The California Consumer Privacy Act (CCPA),
implemented in January 2020, is a law that gives consumers control over the use of their
personal information. It gives consumers the right to know and to regulate how companies use
their data, including its sale to other entities or for other purposes. Organizations doing
business in California and making profits from holding or selling consumers’ personal
information must comply with this measure. The measure is currently a consumer privacy
regulation in California, and organizations found not to comply are fined (Anant et al., 2020).
Institutions that violate the act face a penalty of up to $7,500 for each intentional
violation and up to $2,500 for each unintentional one. The requirements in the law include that
institutions should protect all data collected via the internet. Before processing personally
identifiable information, consent from the owner should be sought. While developing and
implementing data security, institutions should comply with all requirements as provided by the
bill. If an individual closes an account with the bank, the collection of personally
identifiable information should cease within 30 days (the State of California, 2018).
Related works
The security frameworks that various scholars have proposed for big data management in
financial institutions mainly aim to enforce security while utilizing the technologies. A similar
research study is by Gwara et al. (2016), titled “A Framework for Assessing Cloud Computing
Security for Cloud Adoption in Microfinance Banks.” The research’s main objective was to
develop a framework that can assess the security level in the cloud computing environment,
enabling financial institutions to evaluate cloud services before acquiring them.
The researchers' framework targets the following critical aspects of utilizing the cloud
computing environment: security, legal and administrative management, access control
management, cloud infrastructure, data level management, and network level (Gwara et al.,
2016). For each component, the framework identifies challenges and equivalent measures to
counteract them; at the network level, for example, the challenges identified include network
attacks. The framework is applied during an assessment of clouds from the providers, where the
various challenges are identified. The equivalent countermeasures in the framework are then
applied, along with any additional security control measures required. The study’s limitation is
that the security measures are limited to data storage only in big data utilization.
Another framework related to this research is by Zhang et al. (2018), titled “Big data Privacy
protection Model Based on Multi-level Trusted System.” The model proposed by the researchers
introduces a multi-level system for encrypting private user data that prioritizes the data at
highest risk. Hence, a leak in the low-priority data does not compromise the privacy of the
high-priority data. The system model defends against Trojan horse attacks, and risks are
categorized into seven levels. Level 1 is the lowest, while level 7 is the highest-value private
user data (see table 1). In utilizing this model, the institution requires seven encryption
algorithms executed efficiently according to priority. Prioritizing user data privacy is based
on the cost of loss or breach of the information.
Drozdova et al. (2020) propose an architectural framework that preserves data privacy in
cloud computing systems. In the framework, the main components include risk management and
security policies. Addressing the risks that compromise systems is paramount in security. Risk
management involves risk assessment and treatment. The process begins with risk identification.
In risk assessment, threats and vulnerabilities are identified for cloud computing
assets. Assets in cloud computing include data, infrastructure as a service (IaaS), network, cloud
management, virtualization, backup, and logs (Drozdova et al., 2020).
Summary
One of the major problems with big data applications is the privacy issues associated with
the security protocols implemented while handling the information. There is a lot of potential in
applying knowledge to the business to gain success. Understanding financial markets is becoming
increasingly vital. Financial institutions constantly use data to make decisions involving tax
reform, risk analysis, and fraud detection.
Big data is altering the finance sector in significant ways: transforming the banking
culture, creating transparency, enabling algorithmic trading and risk analysis, and leveraging
the customers’ information acquired (Hasan et al., 2020). Besides, big data can potentially
influence economic analysis and modeling (Hasan et al., 2020). Although information security
provides a critical foundation for data protection, privacy remains a concern.
Many privacy and security tools overlap, and some measures to curb cybersecurity threats
pose privacy threats. Enhancing information security, and treating it with urgency, is vital for
an institution to address current and future threats. Lack of proper protection and increased
concern about privacy while using data may increase pressure on an institution for changes and
regulations to deal with persistent cybersecurity attacks and threats. Besides, the major themes
associated with big data privacy identified in the literature review include the following:
Policy requirements: Policies are essential and govern all staff’s ability to access systems
and information. Policies are essential in enforcing measures that prevent employees from
engaging in activities that threaten and compromise systems, such as opening suspicious emails
and leaving the systems open after use (Broeders et al., 2017). The policies should have clear
guidelines on the restriction measures that every employee is mandated to follow.
Laws govern and regulate data privacy, and institutions need to meet the requirements of
the various acts governing personal data usage. Infrastructures are essential in handling big
data; hence privacy entails having the latest security measures that ensure the network is not
open to hackers and the firewall is robust (Broeders et al., 2017). Besides, the data stores need
to be developed and secured with existing frameworks. The design of protocols should entail
appropriate restrictions on access to information and systems, limiting improper use
and unauthorized access.
Banks have adopted open models to attract new customers and retain current ones (Lessambo,
2020). The significant challenge of this application is customers’ trust in sharing information
with each other. Data growth is increasing, and its potential to benefit the business through
analytics is enormous; hence banks need to improve their security platforms to overcome the
privacy issues that limit data utilization.
The current security frameworks have targeted cybersecurity issues only and enhanced
the systems to detect and prevent them. Big data itself can detect and predict vulnerabilities
in systems and provide better insights into areas that require reinforcement. However, the
privacy issues with the application of big data are still a significant concern globally, and
regulations control personal information utilization. Thus, this research provides new
information and security measures to manage big data privacy issues, enabling financial
institutions to attain this technology’s full potential and capability.
The following section, methodology, explains the research paradigm chosen for the
study, the data collection methods and sampling techniques, and the analysis methods for the
data collected.
Chapter Three
Introduction
This chapter describes the methods and procedures utilized in data collection to answer the
research question. The aspects discussed in this chapter include the research paradigm, research
design, sampling methods, data collection method, statistical tests, and approval methods for the
study. Also, the chapter describes how the outcomes were achieved in line with the research
objectives. The methods described aimed to collect information that is unbiased and accurate.
They also aimed to attain neutrality, reliability, validity, and generalizability, which enable
successful research.
Information technology governance is vital in managing risks and focuses on enabling
operations to run efficiently in an organization. An information framework is a roadmap that
describes efficient methods and procedures in controlling risks, aligning business activities with
the objectives, and ensuring an organization meets compliance regulations (Calder & Moir, 2009).
The study aims to provide a security framework to assist the organization in managing privacy
issues. Hence, developing a framework involves overriding the generic systems and creating a
concrete one, or changing a few areas and integrating it into the organization to bring the intended
change.
All financial and government institutions need to implement new ways of securing
personal information beyond the current means. Failure to protect customers' intellectual
property in the systems is consequential and reduces the credibility and integrity of most
government and financial companies. Strong measures are needed to improve the protection of
intellectual property in banking institutions. The system faces many challenges in meeting
information-sharing frameworks, which expose the organization to criminal operations in banks.
The research methodology will provide valuable data obtained from various sources using
different research approaches to gather insights for a comprehensive framework.
The Research Paradigm
The study was descriptive research seeking to provide solutions for problems. The
research adopted the qualitative methodology. Qualitative research typically utilizes a small
sample. In applying the methods approach, a researcher can perform inferential statistics, a
quantitative approach to quantify results, and analyze the narrative information (Etikan & Bala,
2017). A qualitative approach provided insights from literature studies to back up the theories
developed in creating the design (Etikan & Bala, 2017).
The qualitative research method validated the new framework that protects personal user
data. Real-time big data intelligence and analytics helped secure user information, provided
visual patterns of activity involving it, and made it possible to learn from others' mistakes to
detect and prevent the cyber threats that are a challenge in big data. The qualitative approach
included data collection through an interview method to gain insights and opinions from an
expert perspective. It is imperative to identify the current measures that a management team has
undertaken to secure data and that need improvement, and to use the latest data security
measures to protect user information in banking institutions.
Qualitative approach
The qualitative approach allowed the examiner to evaluate people's experiences
concerning a particular topic through research methods such as interviews, case studies,
phenomenological studies, focus discussions, content analysis, and observation (Hennink et al.,
2020). Qualitative research allows for discovery (Williams, 2007). The approach enables a
researcher to identify the issues directly from the participant's point of view and understand the
interpretation and identities they give to objects and people.
Qualitative research aspires to understand the contextual influences of things in their
natural settings. It uses inductive reasoning and purposeful descriptions of items to develop new
theories (Williams, 2007). One aspect of qualitative research is that the study's purpose is to
understand why certain things need to be in a particular way, their influences, and how they
should be. Another aspect is that the data obtained is in words, and the analysis of such data is
interpretive. Also, the number of study participants in qualitative research is small, and the
sampling method is usually purposive. Data collected in this type of research explains a
particular phenomenon and actions (Johnson & Onwuegbuzie, 2004).
A qualitative research method is “a market research method that focuses on obtaining
open-ended and conversational communication” (Williams, 2007, p. 67). This research method
mainly focuses on the way people think rather than what they believe. For example, we can
observe that more women than men visited a convenience store. While conducting qualitative
research in such a scenario, we will establish why men avoid visiting convenience stores more
than women. The qualitative research method is descriptive (Williams, 2007).
Social and behavioral science is known to be the reason for the existence of qualitative
research methods. It is difficult to understand people in today's world, and therefore qualitative
research methods come in handy. The improvement of technology has also expanded the ways in which
qualitative research is conducted (Williams, 2007). The qualitative research method helps a
researcher understand a given group of people. Qualitative research has various methods. They
include interviews, focus groups, ethnographic research, content analysis, and case study
research.
One-on-one interview method
One of the most common ways of conducting qualitative research is through in-depth
interviews. In an interview, there is one-on-one interaction with the subject. The interview
involves one subject at a time, which increases the depth of the information that the respondent
provides. Although every interview involves one-on-one interaction with the respondent, the
different methods and approaches used to conduct interviews create different types of interviews
(Williams, 2007).
The first interview method is the behavioral-based interview. The basis of this type of
interview is the interviewee’s experience and behavior. The second type of interview is the case
interview. In such an interview, the interviewer presents a case scenario to the subject. Then the
interviewer asks the subject questions regarding the scenario presented (Taylor et al., 2015). The
candidates answer the questions with a proper solution. Also, the subjects can ask questions
regarding the scenario.
Interviews are considered one of the most reliable research methods. There are many
reasons behind this argument which you can simply term as the advantages of the interviews. As
stated earlier, the researcher has more one-on-one interaction with the subject (Taylor et al.,
2015). Therefore, there is a high chance that the respondents will answer every question
presented by the interviewer in this case. It increases the reliability of the data. Also, interviews
are time-saving (Taylor et al., 2015).
Having one-on-one interaction with the subject makes the interview effective because the
researcher or the interviewer can better understand the responses (Taylor et al., 2015). The
researcher can follow up on some of the answers given by the candidate for further clarification.
Similarly, in the same scenario, the researcher can observe the subject's reaction (Taylor et
al., 2015). The way the subject reacts helps clarify whether the information given is accurate or
not. Also, this provides flexibility to the interviewers. It means that the interviewer can
change to more suitable questions depending on the scenario at hand (Taylor et al., 2015).
The researcher can also determine the environment in which to conduct the
research or the interview. With interview methods such as email, the researcher in most cases has
very little control over the subject's environment (Taylor et al., 2015). Therefore, the subject
can be distracted and may not provide the best response to the question asked. On the other
hand, in one-on-one interviews, the researcher can decide the location of the interview, which
can be a quieter place with less distraction (Taylor et al., 2015).
Despite the advantages of interviews, some shortcomings are associated with them. The first
is that interviews are costly (Taylor et al., 2015). As stated earlier, interviews involve
one-on-one interaction with the subject. When a participant is far away, the researcher is
required to travel to reach them. This also translates to the factor of time. Interviews are
time-consuming. As the researcher travels, he or she spends a lot of time moving from one
respondent to another.
Secondly, interviews can be biased. In most cases, the respondent reacts depending on
their interviewer. For example, the race of the interviewer can determine how the respondent
responds to the questions. Other factors that can affect how the respondent reacts are the physical
appearance and age of the interviewer (Liamputtong & Ezzy, 2005). The third reason why
interviews are not compelling is that they provide little anonymity, which
is a big concern for many respondents.
In one-on-one interviews, it can be challenging to access the
respondent (Liamputtong & Ezzy, 2005). Some respondents are too far away, making it difficult to
reach them and ask the desired questions (Liamputtong & Ezzy, 2005). It is not
easy when the respondent is in a different country. As stated earlier, it will
take a lot of time and money to move around when the researcher decides to visit the respondent.
Focus Group
The second most used qualitative research method is focus groups. In a focus group,
respondents are usually limited to 6 to 10 people within a target market (Liamputtong & Ezzy,
2005). The main questions asked during a focus group are why, what, and how questions. The
kind of data provided during this research is descriptive, which means it involves no numerical
measurements.
In other words, a focus group is a structured group interview. The significant difference
is that a focus group is more than merely collecting similar data from many participants
at once. A single topic applies whenever a researcher uses a focus group, and people respond
simultaneously. The researcher records the responses and monitors the respondents (Liamputtong
& Ezzy, 2005).
Focus groups are mainly used when views on a given topic are required.
They are also crucial in providing a detailed understanding of the subjects' beliefs and
experiences. There are different criteria used in the research method. They include a stand-alone
process and multi-method design (Liamputtong & Ezzy, 2005). When conducting research using
the focus group method, the size of the group is significant. A large group is generally
discouraged because of the effort required in managing these groups.
Many researchers prefer a focus group because of some of its advantages. One of the
main reasons researchers prefer focus groups is that respondents respond immediately to the
questions. It is crucial because there is an instant understanding of the issues raised by the
respondents. A focus group helps the researchers understand some of the unmet needs
(Liamputtong & Ezzy, 2005). Also, during the focus group, one can discover concerns that
seemed unimportant at the beginning of the focus group. When different respondents raise the
same questions, it means that there is a problem involved or experienced in the particular
area.
This research method is also flexible. In most cases, the researchers do not stick to a fixed
path of interaction with the respondents. Therefore, whenever a given concern or question
does not yield enough answers, the researcher can ask more questions regarding the matter.
Consequently, this makes it a suitable research method because the researcher can get in-depth
information concerning a given topic (Liamputtong & Ezzy, 2005).
Although focus groups are beneficial to the researchers, they have some disadvantages
and shortcomings. First of all, the research outcome can be biased. Some of the factors that can
cause biases in the research outcome are the race, age, and class of the researcher. Second,
focus groups are expensive. In most cases, the
participants require compensation for their participation in the discussion (Liamputtong & Ezzy,
2005). Therefore, the researcher will need to spend money on the preparation and
payment of the participants.
Third, in most focus groups some participants do not get a chance to air their own opinions
freely. This is because some members of the group are shy and may find it hard to air their
views. Therefore, the quality of information collected will be poor in the end. It is also hard
to get an honest opinion on sensitive topics (Liamputtong & Ezzy, 2005), since some
participants are hesitant to answer sensitive questions.
The fourth disadvantage of focus group research is that the group selected may not be an
accurate representation of the target group. As a result, the chosen participants may not
represent the whole society as expected (Liamputtong & Ezzy, 2005). This makes the method
time-consuming and costly because follow-up research is often required.
Ethnographic Research
Ethnographic research is considered an observation method. This method observes people in
their naturally occurring environment (Sofaer, 2002). For this to be possible, the researcher
is required to live with the target group for some time, which allows the researcher to adapt
to their environment. The main goal of the ethnographic research method is to comprehend the
traditions and culture of a given group of people. In most cases, the researcher must stay with
a given research group for months or even years (Hennink et al., 2020).
For many researchers, the ethnographic research method has some advantages. For instance,
it helps the researcher identify unexpected issues found in a particular society (Hennink et
al., 2020). Such unforeseen matters are easy to find because, to conduct ethnographic research
successfully, the researcher has to blend in with the community (Hennink et al., 2020).
Observing people in their natural environment therefore makes it easy to make accurate
observations.
The second advantage of ethnography is that the researcher obtains detailed information.
The researcher can observe the natural attitudes and behavior of the community under study
(Sofaer, 2002). Such attitudes are difficult to record in a controlled environment. Since there
is no control of subjects during the study, they will act naturally, giving the researcher an
opportunity to record correct and accurate information.
Ethnography also has some demerits. One of its significant disadvantages is that it is
time-consuming (Sofaer, 2002). The researcher needs a lot of time to blend in and become part
of the society. It can take months or years for the researcher to fit in entirely and make
observations (Sofaer, 2002). This is because most members of a society will notice when a new
person has joined them. Therefore, they will not act naturally.
Since there is direct interaction between the researcher and the community, there are some
risks involved. The researcher needs to be qualified and well skilled to avoid falling into the
trap of ethnic prejudice and bias. Ethnicity can significantly determine how the researcher
interacts with the community. A researcher influenced by tribal loyalties may fail to deliver
proper observations. Therefore, there is a higher chance that the observations made will be
subjective (Sofaer, 2002).
Case Study Research
In the past few years, many researchers have used the case study research method. As the
name suggests, case study research describes an organization, phenomenon, or entity. In most
cases, a case study is applied in education, social science, and other similar fields. Case
study research may look complex to conduct, but this is not true. On the contrary, it is
considered one of the easiest ways to conduct research. It involves a
deep-dive thorough comprehension of the data collection techniques and varying the data
(Sofaer, 2002).
When a case study describes an event, the outcome is descriptive and exploratory. To reach
a conclusion, the researcher must use multiple sources of information and four sources of
evidence (Sofaer, 2002). In most cases, case studies are also used to explain why people behave
or respond in a given way to a particular policy.
Case study methods fall into different categories. The first category is the collective case
study, in which a group of individuals is studied. The researcher can choose to study the
entire community or focus on a small group of people. The second case study method is the
descriptive case study (Sofaer, 2002). In this case, descriptive theory applies. As the name
suggests, a descriptive case study requires the researcher to observe the given community and
describe their observations.
The third type of case study is the explanatory case study, which makes a causal
investigation. In this case, as the name suggests, the researcher primarily focuses on the
factors that cause people to behave the way they do (Sofaer, 2002). Finally, the fourth type is
the instrumental case study. In this case, studying the group allows the researcher to observe
more than an average person would.
One of the advantages of a case study is that it turns observation into usable data.
Secondly, some of the opinions held about a given group of people turn into facts. Several
different research methodologies can be used (Sofaer, 2002). For instance, questionnaires can
be used to gather observations. Also, cases recorded in the past are valid for research. Other
information, such as that in diaries and journals, can also be used in the case study (Sofaer,
2002).
Another advantage of a case study is remote data collection. The researchers do not need
to visit a given location to conduct their study. They can obtain information through phone
calls or emails (Sofaer, 2002). Interviews conducted during the case study can likewise be
carried out by phone. A case study is also inexpensive. In most cases, accessing the data
required in a study does not cost money. The researcher also does not need funds for
traveling or compensation when interviewing by phone (Sofaer, 2002).
Record Keeping
Research record keeping is the close and careful documentation of research, together with
the clear and transparent management of those records. Record keeping is essential for checking
scientific misconduct, checking validity, protecting intellectual property, and replicating
research (Sofaer, 2002). Record keeping helps ensure that the research conducted is ethical.
Keeping good records also ensures that subsequent research can be conducted quickly and gives
the researcher an opportunity to perform complex analysis. This is because the data needed is
already provided and recorded (Hennink et al., 2020).
Records are only useful if they follow a particular set of guidelines. First, the records
are supposed to be available whenever they are needed. A researcher needs to make clear
recordings (Hennink et al., 2020). A researcher is also required to arrange the information in
a sequenced manner. Record keeping has many advantages. It ensures that the researcher can
monitor the progress of the research. This is because detailed information is provided in
sequence, making it easy to follow (Hennink et al., 2020). Record keeping also allows the
researcher to identify areas not yet dealt with and therefore to focus on areas that are not
well researched.
Research records are essential to current and future researchers because they provide a
reference point. Record keeping also preserves reliable information. As stated earlier, record
keeping is similar to going into a library. Therefore, most of the research work that the
researcher draws from the records is reliable.
Process of Observation
The process of observation refers to the procedure used to observe and record a set of
activities or behaviors. In most cases, the observation can be either formal or informal.
Observations are categorized as participant observation, natural observation, and controlled
observation.
For an observation to be useful, it must be conducted in an organized manner (Hennink et
al., 2020). Strict guidelines must be followed for the information to be useful. When
observing, keeping a record is an integral part of the process. One of the significant
advantages of observation is that it provides a reliable set of data. Also, the information
provided is firsthand (Hennink et al., 2020). Observation is also less expensive. This is
because the researcher does not constantly need to spend money to conduct a proper
observation. All that they need is an appropriate set of skills.
The most significant disadvantage of observation is that it is time-consuming. The
researcher needs a lot of time to observe a given trait in the community and to make proper
recordings. If time is limited, there is a high chance that the data collection process will
be less effective. Also, the researcher needs to compare different works to come up with a
concrete conclusion, which is time-consuming. This method can also be expensive when the
researcher needs to travel from one place to another to observe these groups.
This research utilizes the interview as a method of data collection. There are many
reasons why the interview is the best choice. First, the interview is the best way to obtain detailed
information from the respondents. The researcher can at any time ask for more information
concerning a given topic. Also, the questions asked are more detailed, leading the respondents
to offer precise answers. In this case, there is a high chance of a high response rate.
Since the researcher can make observations, they can determine whether the respondent is
comfortable with a question or not. They can then decide whether to continue asking the
questions (Hennink et al., 2020). Body language, which can be observed during one-on-one
interviews, is therefore important.
In line with a qualitative research method's concepts, it is appropriate for this study to
provide information on individuals' experiences with big data and their challenges while utilizing
and managing it. The proper method for obtaining this information is a phenomenological study.
A phenomenological study focuses on participants' experiences and perceptions of specific
situations and things (Leedy et al., 2019). Data collection is through interviews, which is the
primary method intended for this research.
The study also seeks to learn about the current systems for managing big data in financial
institutions and their shortcomings. In addition to the participants' opinions and views,
content analysis will also provide information to achieve this research's objective. Content
analysis entails examining publications and books to identify patterns and themes that explain
the research questions (Leedy et al., 2019). The study chose the in-depth interview as the
research method. Participants' experiences were needed to provide insights and input on the
research questions, mainly by describing their experience managing big data.
Research Design
Research design is critical in research as it influences the achievement of research objectives
by obtaining evidence (Sileyew, 2019). Successful research requires a straightforward process to
effectively address the problem identified and come up with applicable recommendations.
Research designs are intricate, and considerable time is needed to select the best one to
ensure firm and convincing conclusions. With a suitable design, the study will offer
comprehensive solutions to the research problem (Sileyew, 2019). The research problem guides
the choice among many research designs, including action-research, causal, cohort,
cross-sectional, descriptive, exploratory, longitudinal, observational, and philosophical
(Kothari, 2004).
The study utilized a descriptive design to explain the framework that minimizes Big Data
privacy issues in financial institutions and their impact. A descriptive research design
systematically describes a situation, program, or phenomenon from individual, expert, and
organizational perspectives (Sileyew, 2019). Besides, it entails a process of data collection
to test a hypothesis about the aspects under study.
Descriptive research provides statistical information and is dependent on instrumentation
and measurement techniques (Sileyew, 2019). The descriptive design yields rich data and
provides valuable recommendations (Nassaji, 2015). The design also allows for collecting...