Thursday, March 3, 2016

Structured Data vs Unstructured Data




source: www.sherpasoftware.com/blog/structured-and-unstructured-data-what-is-it/
Structured data is data that can be organized and displayed in a table with columns and rows, and that can be handled and refined by data visualization tools; it is almost always stored in text form. Unstructured data is data that may be proprietary to another entity and that may not have an identifiable internal structure. Its content can be grouped into clusters that have no value until a process identifies them and stores them in an organized fashion. Specialized software exists for this process: it searches for items in the data and categorizes them, so that a “structured set” can be derived from the unstructured data.
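As a rough illustration of that search-and-categorize step, here is a minimal Python sketch that pulls a few identifiable items out of free text and stores them as rows with fixed columns. The sample notes, the column names and the two regular expressions are all assumptions made up for this example; real tools are far more sophisticated than this.

```python
import re

# Hypothetical free-text notes; the content is invented for illustration.
NOTES = [
    "Call from ana@example.com on 2016-03-01 about a late invoice.",
    "Meeting moved to 2016-03-04, contact bob@example.org for details.",
    "No contact details in this one.",
]

# Deliberately simple patterns (assumptions, not production-grade validators).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def to_row(text):
    """Extract a few identifiable items from free text into fixed columns."""
    email = EMAIL.search(text)
    date = DATE.search(text)
    return {
        "email": email.group(0) if email else None,
        "date": date.group(0) if date else None,
        "text": text,
    }


# Each note becomes one row of a "structured set" with the same columns.
rows = [to_row(note) for note in NOTES]
for row in rows:
    print(row["email"], row["date"])
```

The third note produces a row with empty email and date columns, which shows that some of the content stays unclassified until a better rule is found for it.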

There are two main ways to determine whether a data set is structured or unstructured, in addition to the definition of each type (that is, whether the data explicitly has or does not have a structure). The first is that if the data has some form of structure that has not been formally identified, but you can indirectly derive a way of identifying that structure, then the data should not be labeled as unstructured. The second is that if the data is structured but you cannot derive an analysis from it, then you can consider the data unstructured.

Unstructured data can be found in many places, such as emails, word processing files, PDF files, spreadsheets, digital images, video, audio, social media posts and many others. These kinds of “documents”, or the places where unstructured data is found, contain what is called rich data. Rich data includes pictures, video, voice, x-rays, PowerPoint presentations, GPS locations, “check-ins”, the people in a certain picture in a certain place, and so on.

While rich data types provide a remarkable opportunity to analyze from angles that text alone cannot offer, they do so at the expense of storage space. Rich media types are not just slightly larger than basic text; they can be orders of magnitude larger.
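To give a sense of scale, here is a back-of-the-envelope Python sketch comparing a short text with a single uncompressed image. The figures are assumptions chosen for illustration (a 500-word text at about 6 bytes per word, and a 1920x1080 image at 3 bytes per pixel); real files are usually compressed, so actual ratios vary a lot.

```python
import math

# Assumed sizes, for illustration only.
text_bytes = 500 * 6            # 500 words at roughly 6 bytes per word
image_bytes = 1920 * 1080 * 3   # one uncompressed RGB frame, 3 bytes per pixel

ratio = image_bytes / text_bytes
orders = math.floor(math.log10(ratio))  # whole orders of magnitude

print(f"text: {text_bytes} bytes, image: {image_bytes} bytes")
print(f"the image is about {ratio:.0f} times larger ({orders} orders of magnitude)")
```

Under these assumptions the single image is about three orders of magnitude larger than the text, and video or audio only widens the gap.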


To help with this issue, organizations have turned to a variety of software solutions designed to search unstructured data and extract important information. The primary benefit of these tools is the ability to gather valuable information that can help a business succeed in a competitive environment. Because the volume of unstructured data is growing so quickly, many enterprises also turn to technological solutions, both software and hardware, that help them better manage and store unstructured data. These can include hardware or software solutions that enable them to make the most efficient use of their available storage space. The following figure shows what tools are available and why organizations are using them.


source: http://www.webopedia.com/TERM/U/unstructured_data.html


The use of unstructured data does not imply that big data technologies will replace traditional data warehouses. Rather, they will exist in parallel. The traditional data warehouse (DW) will still play a fundamental role in the business. Financial analysis and other applications connected to the DW will still be important, and the DW itself will be a source of some of the data used in big data projects and will probably receive data from the results of advanced analysis projects. So big data is an evolution of, rather than a replacement for, the DW and data engines. Therefore I think that in the future data visualization is going to collect and analyze information from a much broader spectrum and, because of this, become a tool for rich analysis.
