1. SECURE AND VERIFIABLE POLICY UPDATE OUTSOURCING FOR
BIG DATA ACCESS CONTROL IN THE CLOUD
ABSTRACT
Due to the high volume and velocity of big data, storing big data in the cloud is an
effective option, as the cloud has the capability to store big data and process a
high volume of user access requests. Attribute-Based Encryption (ABE) is a
promising technique for ensuring the end-to-end security of big data in the cloud.
However, policy updating has always been a challenging issue when ABE is used
to construct access control schemes. A trivial implementation is to let data owners
retrieve the data, re-encrypt it under the new access policy, and then send it back to
the cloud. This method, however, incurs a high communication overhead and a
heavy computation burden on data owners. A novel scheme is proposed that
enables efficient access control with dynamic policy updating for big data in the
cloud. The focus is on developing an outsourced policy updating method for ABE
systems. This method avoids the transmission of encrypted data and minimizes the
computation work of data owners by making use of the data previously encrypted
under old access policies. Policy updating algorithms are proposed for different
types of access policies. An efficient and secure method is also proposed that
allows data owners to check whether the cloud server has updated the ciphertexts
correctly. The analysis shows that this policy updating outsourcing scheme is
correct, complete, secure, and efficient.
2. INTRODUCTION
Big data refers to high volume, high velocity, and/or high variety information
assets that require new forms of processing to enable enhanced decision making,
insight discovery and process optimization. Due to its high volume and
complexity, it becomes difficult to process big data using on-hand database
management tools. An effective option is to store big data in the cloud, as the cloud
has the capability to store big data and process a high volume of user access
requests efficiently. When big data is hosted in the cloud, data security becomes a
major concern, since cloud servers cannot be fully trusted by data owners.
3. PROBLEM DEFINITION
Policy updating is a difficult issue in attribute-based access control systems,
because once the data owner has outsourced the data to the cloud, it does not keep
a copy in local systems. When the data owner wants to change the access policy, it
has to transfer the data back to the local site from the cloud, re-encrypt the data
under the new access policy, and then move it back to the cloud server. Doing so
incurs a high communication overhead and a heavy computation burden on data
owners. This motivates the development of a new method that outsources the task
of policy updating to the cloud server.
The grand challenge of outsourcing policy updating to the cloud is to guarantee the
following requirements:
1) Correctness: Users who possess sufficient attributes should still be able to
decrypt the data encrypted under the new access policy by running the original
decryption algorithm.
2) Completeness: The policy updating method should be able to update any type of
access policy.
3) Security: The policy updating should not break the security of the access control
system or introduce any new security problems.
4. EXISTING SYSTEM
Attribute-Based Encryption (ABE) has emerged as a promising technique for
ensuring end-to-end data security in cloud storage systems. It allows data
owners to define access policies and encrypt the data under those policies, such
that only users whose attributes satisfy these access policies can decrypt
the data.
The policy updating problem has been discussed for both key-policy and
ciphertext-policy structures.
Disadvantages
As more and more organizations and enterprises outsource data to the
cloud, policy updating becomes a significant issue, since data access policies
may be changed dynamically and frequently by data owners. However, this
policy updating issue has not been considered in existing attribute-based
access control schemes.
Key-policy and ciphertext-policy structures cannot satisfy the
completeness requirement, because they can only delegate a key/ciphertext
to a new access policy that is more restrictive than the previous
policy.
Furthermore, they cannot satisfy the security requirement either.
5. PROPOSED SYSTEM
This work focuses on solving the policy updating problem in ABE systems and
proposes a secure and verifiable policy updating outsourcing method.
Instead of retrieving and re-encrypting the data, data owners only send
policy updating queries to the cloud server and let the cloud server update the
policies of the encrypted data directly, which means the cloud server does not
need to decrypt the data before or during the policy update.
To formulate the policy updating problem in ABE systems and develop a new
method to outsource the policy updating to the server.
To propose an expressive and efficient data access control scheme for big
data, which enables efficient dynamic policy updating.
To design policy updating algorithms for different types of access policies,
e.g., Boolean Formulas, LSSS Structure and Access Tree.
To propose an efficient and secure policy checking method that enables data
owners to check whether the ciphertexts have been updated correctly by
cloud server.
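As an illustration of the access structures named above, the following is a minimal Python sketch (not the scheme's actual algorithms) of testing whether a user's attribute set satisfies a threshold access tree; the attribute names and the sample policy are hypothetical:

```python
# Minimal sketch of checking whether a set of attributes satisfies a
# threshold access tree: each internal node is (threshold, children),
# where AND = threshold len(children) and OR = threshold 1; a leaf is
# an attribute name. This is illustrative, not the scheme's algorithm.

def satisfies(node, attrs):
    if isinstance(node, str):            # leaf: a single attribute
        return node in attrs
    threshold, children = node
    # Count how many children are satisfied and compare to the threshold.
    return sum(satisfies(child, attrs) for child in children) >= threshold

# Hypothetical policy: (Doctor AND Cardiology) OR Admin
policy = (1, [(2, ["Doctor", "Cardiology"]), "Admin"])
print(satisfies(policy, {"Doctor", "Cardiology"}))  # True
print(satisfies(policy, {"Doctor"}))                # False
print(satisfies(policy, {"Admin"}))                 # True
```

Boolean formulas over attributes reduce to this tree form (AND/OR as thresholds), which is one reason the scheme can treat several policy types uniformly.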
6. ADVANTAGES
This scheme not only satisfies all the above requirements, but also avoids
transferring the encrypted data back and forth and minimizes the computation
work of data owners by making full use of the data previously encrypted
under old access policies in the cloud.
This method does not require any help from data users; data owners can
check the correctness of the ciphertext updating with their own secret keys and
the checking keys issued by each authority.
This method also guarantees that data owners cannot use their secret keys to
decrypt any ciphertexts encrypted by other data owners, although their secret
keys contain components associated with all the attributes.
7. SYSTEM ARCHITECTURE:
MODULES:
1. Identity token issuance
2. Policy decomposition
3. Identity token registration
4. Data encryption and uploading
5. Data downloading and decryption
6. Encryption evolution management
8. MODULES DESCRIPTION:
Identity token issuance:
IdPs are trusted third parties that issue identity tokens to Users based on their
identity attributes. It should be noted that IdPs need not be online after they issue
identity tokens. An identity token, denoted by IT has the format{nym, id-tag, c, σ},
where nym is a pseudonym uniquely identifying a User in the system, id-tag is the
name of the identity attribute, c is the Pedersen commitment for the identity
attribute value x and σ is the IdP’s digital signature on nym, id-tag and c.
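The Pedersen commitment c can be sketched concretely. The toy Python example below uses deliberately tiny group parameters chosen for illustration; a real IdP would use a large prime-order group, so this is not a secure implementation:

```python
import secrets

# Toy Pedersen commitment sketch. The group parameters are tiny and
# purely illustrative; real deployments use a large prime-order group.
p, q = 2039, 1019          # p = 2q + 1, both prime
g, h = 4, 9                # generators of the order-q subgroup (squares mod p)

def commit(x, r):
    """c = g^x * h^r mod p hides attribute value x under randomness r."""
    return (pow(g, x, p) * pow(h, r, p)) % p

def open_commitment(c, x, r):
    """A verifier checks the opening (x, r) against commitment c."""
    return c == commit(x, r)

x = 25                      # hypothetical identity attribute value, e.g. age
r = secrets.randbelow(q)    # blinding factor
c = commit(x, r)
print(open_commitment(c, x, r))       # True
print(open_commitment(c, x + 1, r))   # False
```

The blinding factor r is what keeps c from revealing x, while the binding property prevents the User from later opening c to a different attribute value.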
Policy Decomposition:
In this module, using the policy decomposition algorithm, the Owner decomposes
each ACP into two sub ACPs such that the Owner enforces the minimum number
of attributes to assure confidentiality of data from the Cloud. The algorithm
produces two sets of sub ACPs, ACPB Owner and ACPB Cloud. The Owner
enforces the confidentiality related sub ACPs in ACPB Owner and the Cloud
enforces the remaining sub-ACPs in ACPB Cloud.
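A hedged sketch of this two-part split follows. The actual policy decomposition algorithm is more involved (it minimizes the number of attributes the Owner must enforce); here each attribute condition is simply routed by a confidentiality set, and the condition strings are hypothetical:

```python
# Simplified sketch of decomposing an ACP into ACPB_Owner and ACPB_Cloud.
# Routing by an explicit "confidential" set is an assumption made for
# illustration; the real algorithm computes a minimal Owner-enforced part.

def decompose(acp, confidential):
    """Split an ACP (a set of attribute conditions) into the sub-ACP the
    Owner enforces and the sub-ACP the Cloud enforces."""
    owner_part = {cond for cond in acp if cond in confidential}
    cloud_part = acp - owner_part
    return owner_part, cloud_part

# Hypothetical attribute conditions:
acp = {"role = doctor", "level >= 3", "dept = cardiology"}
owner_acp, cloud_acp = decompose(acp, confidential={"level >= 3"})
print(sorted(owner_acp))  # ['level >= 3']
print(sorted(cloud_acp))  # ['dept = cardiology', 'role = doctor']
```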
Identity Token Registration:
Users register their ITs to obtain secrets in order to later decrypt the data they are
allowed to access. Users register their ITs related to the attribute conditions in
ACC with the Owner, and the rest of the identity tokens related to the attribute
conditions in ACB/ACC with the Cloud using the AB-GKM::SecGen algorithm.
When Users register with the Owner, the Owner issues them two sets of secrets for
the attribute conditions in ACC that are also present in the sub ACPs in ACPB
Cloud. The Owner keeps one set and gives the other set to the Cloud. Two
different sets are used in order to prevent the Cloud from decrypting the Owner-
encrypted data.
Data encryption and uploading:
The Owner encrypts the data based on the sub-ACPs in ACPB Owner and uploads
them along with the corresponding public information tuples to the Cloud. The
Cloud in turn encrypts the data again based on the sub-ACPs in ACPB Cloud. Both
parties execute ABGKM::KeyGen algorithm individually to first generate the
symmetric key, the public information tuple PI and access tree T for each sub
ACP.
Data downloading and decryption:
Users download encrypted data from the Cloud and decrypt twice to access the
data. First, the Cloud generated public information tuple is used to derive the OLE
key and then the Owner generated public information tuple is used to derive the
ILE key using the AB-GKM::KeyDer algorithm. These two keys allow a User to
decrypt a data item only if the User satisfies the original ACP applied to the data
item.
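The layered decryption order described above (outer Cloud layer first, then inner Owner layer) can be sketched with a toy symmetric cipher. The SHA-256 counter keystream below stands in for a real cipher, and the hard-coded keys stand in for the OLE/ILE keys derived via AB-GKM::KeyDer; both are assumptions for illustration:

```python
import hashlib

# Toy two-layer encryption sketch; XOR with a hash-derived keystream
# stands in for real symmetric encryption (illustrative only, not AES).

def keystream_xor(key: bytes, data: bytes) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

ile_key = b"owner-layer-key"      # inner (Owner) layer key, hypothetical
ole_key = b"cloud-layer-key"      # outer (Cloud) layer key, hypothetical

plaintext = b"patient record"
inner = keystream_xor(ile_key, plaintext)   # Owner encrypts first
outer = keystream_xor(ole_key, inner)       # Cloud encrypts again

# A User peels the layers in reverse: OLE (outer) key first, then ILE key.
recovered = keystream_xor(ile_key, keystream_xor(ole_key, outer))
print(recovered == plaintext)  # True
```

A User who can derive only one of the two keys recovers nothing useful, which is exactly why decryption requires satisfying the original ACP.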
Encryption evolution management:
After the initial encryption is performed, affected data items need to be re-
encrypted with a new symmetric key if credentials are added or removed. Unlike the
SLE approach, when credentials are added or revoked, the Owner does not have to
be involved. The Cloud generates a new symmetric key and re-encrypts the affected
data items.
SYSTEM CONFIGURATION:-
HARDWARE CONFIGURATION:-
Processor - Pentium IV
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
SOFTWARE CONFIGURATION:-
• Operating System : Windows XP
• Coding Language : ASP.NET, C#.NET
• Database : SQL Server 2005
DATA FLOW DIAGRAM:
1. The DFD is also called a bubble chart. It is a simple graphical formalism
that can be used to represent a system in terms of the input data to the system,
the various processing carried out on this data, and the output data generated
by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It
is used to model the system components. These components are the system
process, the data used by the process, an external entity that interacts with
the system and the information flows in the system.
3. DFD shows how the information moves through the system and how it is
modified by a series of transformations. It is a graphical technique that
depicts information flow and the transformations that are applied as data
moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. DFDs
may be partitioned into levels that represent increasing information flow and
functional detail.
UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized
general-purpose modeling language in the field of object-oriented software
engineering. The standard is managed, and was created by, the Object
Management Group.
The goal is for UML to become a common language for creating models of
object-oriented computer software. In its current form, UML comprises two
major components: a meta-model and a notation. In the future, some form of
method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying,
visualizing, constructing, and documenting the artifacts of software systems, as
well as for business modeling and other non-software systems.
The UML represents a collection of best engineering practices that have
proven successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software
and the software development process. The UML uses mostly graphical notations
to express the design of software projects.
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that
they can develop and exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type of
behavioral diagram defined by and created from a Use-case analysis. Its purpose is
to present a graphical overview of the functionality provided by a system in terms
of actors, their goals (represented as use cases), and any dependencies between
those use cases. The main purpose of a use case diagram is to show what system
functions are performed for which actor. Roles of the actors in the system can be
depicted.
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language
(UML) is a type of static structure diagram that describes the structure of a system
by showing the system's classes, their attributes, operations (or methods), and the
relationships among the classes. It shows which class holds which information.
The class diagram defines three classes:
User - attributes: view files, view transactions, view & edit profile, file download; operations: file download().
Data owner - attributes: view files, view & edit profile, file upload, view transaction; operations: create file access(), file upload(), create sub domain().
Admin - attributes: create cloud server, create data owner, create domain, create sub domain, view & edit profile, view & edit data owner profile; operations: create data owner(), create domain(), create sub domain().
SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction
diagram that shows how processes operate with one another and in what order. It is
a construct of a Message Sequence Chart. Sequence diagrams are sometimes called
event diagrams, event scenarios, and timing diagrams.
The sequence diagram involves four participants: User, Admin, Owner, and Database. Its interactions include: upload files, verify owner files, edit profile, edit owner and admin profile, view owner details & owner files, file download, create owner, create domain & sub domain, view user details, file access control, file request, create cloud server, and file response.
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities
and actions with support for choice, iteration and concurrency. In the Unified
Modeling Language, activity diagrams can be used to describe the business and
operational step-by-step workflows of components in a system. An activity
diagram shows the overall flow of control.