Showing posts from August, 2022

Welcome!

  I want to thank you for becoming a contributing member of TISD, my blog. You may not have known exactly what to expect when you subscribed. Perhaps you read something you liked, so you decided to give it a try. I’d like to give you a clearer sense of what I’m trying to do, why I think it matters, and how to get more out of it. Soon, someone will publish on TISD to share an experience, an idea, or a perspective with the online world. Some will be professional writers representing the technical community. As momentum builds over time, I hope to hear from fellow doctoral students, and from students just beginning their college journey. Others will simply be publishing of their own accord. One might even be you, writing about what matters to you. Each has a chance to influence others, express their ideas, initiate a topic or project, and perhaps even start an innovative campaign. I hope to encourage these writers to find the audience they deserve, using a community of crit...

Caveats of Supervised Learning and Big Data

Data mining and analytics are used to test hypotheses and detect trends in large data sets. In statistics, significance is determined to some extent by sample size. By facilitating process optimization, boosting insight discovery, and enhancing decision making, the big data revolution has the potential to transform our daily lives, our workplaces, and our ways of thinking. Learning patterns from data to improve performance on different tasks is the goal of machine learning, a subfield of computer science. Machine learning's capacity to learn from data and deliver data-driven insights, judgments, and predictions is central to data analytics and crucial to realizing this enormous promise. While this new context is ideal for some machine learning tasks, traditional methodologies were designed in a previous era and, as a result, are predicated on several assumptions, such as the data set fitting entirely in memory. The inconsistencies between these assumptions and the realiti...
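One concrete workaround for the in-memory assumption is incremental (out-of-core) learning. Below is a minimal sketch, assuming scikit-learn and NumPy, that trains a classifier one chunk at a time instead of loading the whole dataset; the chunked data source here is simulated.

```python
# Incremental (out-of-core) learning: a minimal sketch assuming scikit-learn.
# The data stream is simulated; a real pipeline would read chunks from disk.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(42)

def data_chunks(n_chunks=10, chunk_size=1_000, n_features=20):
    """Yield (X, y) chunks, standing in for batches read from a large file."""
    for _ in range(n_chunks):
        X = rng.normal(size=(chunk_size, n_features))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy decision rule
        yield X, y

clf = SGDClassifier()          # linear model fit by stochastic gradient descent
classes = np.array([0, 1])     # must be declared up front for partial_fit

for X, y in data_chunks():
    clf.partial_fit(X, y, classes=classes)  # update the model one chunk at a time

X_test, y_test = next(data_chunks(n_chunks=1))
print(f"held-out accuracy: {clf.score(X_test, y_test):.3f}")
```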

Quantitative Research and Numeric Data

 By Nathan B. Smith The appropriate selection of a research technique is critical for sound scientific research and is determined mainly by matching research goals to the features of various research procedures. Given that researchers in economics, business, and technology must choose from various methodologies and approaches, selecting an acceptable research methodology that supports cross-disciplinary study is one of the most challenging choices a researcher must make. As a result, it is critical to consider the trade-offs inherent in qualitative and quantitative research methods. As a scholar-practitioner, one must examine strategic management theory to choose an appropriate research approach for identifying and assessing critical components and phases. Discussion Scenario This paper presents a case study of a veteran-owned small business called TechCoaches Information Systems Design that provides technical documentation and spare parts cataloging services to significant aerospac...

Big Data Analytics, Frameworks, Applied Statistics, and Tools for Bioinformatics

By Nathan B. Smith Over the past few years, the amount of genetic data freely accessible to the public has increased significantly. This phenomenon coincides with a dramatic decline in the cost of genome sequencing. Recent studies and cohorts have produced massive datasets comprising more than 100,000 individuals. These datasets have been analyzed jointly to extract genetic variation across populations. As a result, massive volumes of variation data have been produced for each cohort. Discussion Genomic medicine uses a patient's genetic information to develop tailored approaches to diagnostic or therapeutic decision-making. This concept is called "personalized medicine." By analyzing data on a massive scale and drawing on a variety of data sets, a technique known as "big data analytics" can unearth previously unseen patterns, correlations, and other insights. Integration and manipulation of diverse genomic data, as well as comprehensive electronic he...
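To make "extracting variation across populations" concrete, here is a minimal sketch, assuming pandas and using entirely invented sample IDs, variants, and populations, that aggregates per-sample variant calls into per-population allele frequencies.

```python
# Aggregating toy variant calls into per-population allele frequencies.
# Sample IDs, the variant name, and the populations are invented for illustration.
import pandas as pd

calls = pd.DataFrame({
    "sample":     ["s1", "s2", "s3", "s4", "s5", "s6"],
    "population": ["EUR", "EUR", "EAS", "EAS", "AFR", "AFR"],
    "variant":    ["rs0001"] * 6,
    "alt_copies": [0, 1, 1, 2, 0, 0],  # alternate-allele copies per diploid sample
})

# Allele frequency = alternate copies observed / total copies (2 per diploid sample).
freq = (
    calls.groupby(["variant", "population"])["alt_copies"]
         .apply(lambda s: s.sum() / (2 * len(s)))
         .rename("allele_freq")
)
print(freq)
```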

Correlation

By Nathan B. Smith Correlation is a method for determining how two variables are related in a dataset. Regression is the study of how one variable influences another. In correlation analysis, the two variables are treated symmetrically, whereas in regression analysis, one is assumed to be asymmetrically dependent on the other (Lindley, 1990). Discussion Part 1 Leadership at a large corporation states that they want to identify the influence of years of service on workers' productivity levels and to anticipate future productivity based on length of experience in years. The corporation has statistics on all its personnel and assesses each associate's productivity using a reasonable measure. This type of inquiry and analysis is common in business administration; various methods address these questions. To determine whether a variable (X) (in this case, years of experience) is useful in forecasting another variable (Y) (level of productivity), a significance test can be used. Thi...
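A minimal sketch of that significance test, assuming SciPy and NumPy and using invented experience and productivity figures, is shown below: the slope's p-value tests whether years of experience (X) helps predict productivity (Y).

```python
# Testing whether years of experience (X) helps predict productivity (Y).
# The data are synthetic; a real analysis would use the firm's records.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
experience = rng.uniform(0, 30, size=200)                      # years of service
productivity = 50 + 1.2 * experience + rng.normal(0, 8, 200)   # toy linear relationship

# Symmetric view: correlation between the two variables.
r, r_pvalue = stats.pearsonr(experience, productivity)
print(f"Pearson r = {r:.3f} (p = {r_pvalue:.2g})")

# Asymmetric view: regress Y on X; the slope's p-value tests H0: slope = 0.
fit = stats.linregress(experience, productivity)
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.1f}, p = {fit.pvalue:.2g}")
```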

Descriptive Analysis: Central Tendency and Variance

By Nathan B. Smith Quantitative research makes heavy use of descriptive statistics. Descriptive statistics can be subdivided into measures of central tendency and measures of variability (or spread). Central tendency (or center) measures include the mean, median, and mode. If the dataset under review is perfectly symmetric and unimodal, the mean, median, and mode will be equal. The spread measures indicate how close together or far apart data observations are. The measure of spread is typically chosen with respect to the measure of center that best characterizes the dataset and includes the standard deviation, variance, minimum and maximum values, skewness, and kurtosis. Discussion Independent and dependent variables When conducting a scientific experiment, a researcher must identify two types of primary variables: the independent and the dependent variables. The independent variable is manipulated, controlled, or changed to study its effect on the dependent variable. On the other hand, the dependent variable is...
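A minimal sketch of these center and spread measures, assuming NumPy and SciPy and using an invented sample:

```python
# Computing the center and spread measures named above for a toy sample.
import numpy as np
from scipy import stats

data = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)  # invented observations

# Central tendency
print("mean:   ", np.mean(data))
print("median: ", np.median(data))
print("mode:   ", stats.mode(data, keepdims=False).mode)

# Spread
print("std dev:", np.std(data, ddof=1))    # sample standard deviation
print("variance:", np.var(data, ddof=1))
print("min/max:", data.min(), data.max())
print("skewness:", stats.skew(data))
print("kurtosis:", stats.kurtosis(data))   # excess kurtosis (normal = 0)
```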

Exploring Strategies for Developing a Big Data Analytics Processing Pipeline for Aerospace Aftermarket Technical Service Organizations: A Qualitative Study

By Nathan B. Smith Research Prospectus Abstract This study addresses the strategies needed to establish a technical writing analytics application. It is envisioned that an exploratory qualitative research design approach will be followed to explore emerging technologies, identify insights, and formulate strategies. Participants will include data scientists, big data engineers, technical engineering writers working in the aviation industry, information technology managers, and commercial aviation aftermarket support experts from various maintenance, repair, and overhaul organizations, airframer companies, and military aviation intermediate maintenance departments (AIMD). The theoretical framework and constructs are intended to align with big data analytics, data mining, business intelligence (BI), machine learning (ML), and artificial intelligence (AI), with a focus on current technological advancements in big data studies (Lakshmanan, Robinson, & Munn, 2021). Data will be col...

Numeric Data: Comparison of NoSQL Databases

By Nathan B. Smith NoSQL databases were not designed for the same objectives as SQL databases. While SQL was designed to tackle the storage challenges of big, structured datasets, NoSQL solutions were established to solve the storage problems of massive unstructured datasets. Both systems have advantages and drawbacks. NoSQL databases provide several benefits when storing, processing, and querying large amounts of data. NoSQL databases are well-suited to high volumes of state changes per second and millions of concurrent, dispersed users. They also provide superior sharding and real-time access, and they make data replication possible at reduced cost while maximizing system resources. Relational databases, on the other hand, are excellent for structured data, complicated queries, secure transactions, and a high level of integrity (Redmond & Wilson, 2012). Because NoSQL solutions are still evolving, they may not be fully mature. Big data en...
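A minimal sketch of the contrast, storing the same record in a fixed relational schema (SQLite, from the Python standard library) and as a schema-flexible document (MongoDB via pymongo). The database and collection names are hypothetical, and the MongoDB half assumes a server listening on localhost:27017.

```python
# The same customer record stored two ways: a fixed relational schema (SQLite)
# versus a schema-flexible document (MongoDB via pymongo).
import sqlite3

# --- Relational: the schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?, ?)", (1, "Acme Corp", "Wichita"))
print(conn.execute("SELECT name, city FROM customers WHERE id = 1").fetchone())

# --- Document store: each document carries its own shape, including nesting.
# Assumes a MongoDB server running locally; names below are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
customers = client.demo_db.customers
customers.insert_one({
    "_id": 1,
    "name": "Acme Corp",
    "city": "Wichita",
    "contacts": [{"role": "buyer", "email": "buyer@example.com"}],  # no schema change needed
})
print(customers.find_one({"_id": 1}))
```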

Parametric and Nonparametric Analysis

By Nathan B. Smith Numerous instances occur when data acquired from an organization do not meet the assumptions of parametric analysis. In such cases, a practitioner cannot run a t-test or an F-test (ANOVA) on the data. Professional practitioners and academics should be conversant with these tests, including the chi-square test, the Mann–Whitney U test, the Wilcoxon signed-rank test, and the Kruskal–Wallis one-way analysis of variance. There are circumstances in which each of these tests is appropriate. It is essential to know when to use each, what each test is used for, and why it is preferred over t-tests and ANOVA (Huck, 2012). Discussion The t-test is used to assess whether two populations are statistically distinct, while ANOVA (the F-test) is used to discover whether three or more populations are statistically distinct. Both tests examine the difference in means and the spread of the distributions (such as variance) across groups; however, their methods for determining statistical significance are distinct. Thes...
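A minimal sketch of these tests, assuming SciPy and invented samples; the parametric tests come first, followed by their nonparametric counterparts:

```python
# Parametric tests and their nonparametric counterparts on invented samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, size=30)   # group A
b = rng.normal(11.0, 2.0, size=30)   # group B
c = rng.normal(12.0, 2.0, size=30)   # group C

# Parametric: t-test (two groups) and one-way ANOVA F-test (three or more).
print("t-test:        ", stats.ttest_ind(a, b))
print("one-way ANOVA: ", stats.f_oneway(a, b, c))

# Nonparametric counterparts, used when parametric assumptions fail.
print("Mann-Whitney U:", stats.mannwhitneyu(a, b))   # two independent groups
print("Wilcoxon:      ", stats.wilcoxon(a, b))       # two paired samples
print("Kruskal-Wallis:", stats.kruskal(a, b, c))     # three or more groups

# Chi-square test of independence on a toy 2x2 contingency table of counts.
table = np.array([[20, 10],
                  [12, 18]])
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"chi-square: chi2={chi2:.2f}, p={p:.3f}, dof={dof}")
```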

Motion Analytics: A Case Study

By Nathan B. Smith Motion analysis is becoming easier for the typical organization thanks to machine learning, which generates new use cases and applications that might provide value. Collecting high-quality movement data, the kind that enables motion analytics used to analyze elite athletes, once required sophisticated sensors and a room full of equipment. However, things are beginning to shift. Machine learning algorithms are beginning to extract the information that complex instruments like lidar and specialized sensors previously provided. Instead of needing a special sensor vest or dedicated space, developers are discovering methods to correctly record things like yoga alignment using the camera built into contemporary smartphones (Lawton, 2020). Discussion Sports performance and pathologic gait are analyzed using biomechanical gait analysis. Motion capturing technologies, research methods, and data processing ap...
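As a minimal sketch of camera-based pose capture, assuming the opencv-python and mediapipe packages (one common approach to extracting body landmarks from an ordinary webcam; not necessarily the pipeline Lawton describes):

```python
# Camera-based pose landmarks from a webcam using MediaPipe Pose.
# Assumes the opencv-python and mediapipe packages and an available camera.
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

cap = cv2.VideoCapture(0)  # default webcam
with mp_pose.Pose(min_detection_confidence=0.5) as pose:
    ok, frame = cap.read()
    if ok:
        # MediaPipe expects RGB; OpenCV captures BGR.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Each landmark is a normalized (x, y, z) body keypoint.
            for i, lm in enumerate(results.pose_landmarks.landmark):
                print(f"landmark {i}: x={lm.x:.2f}, y={lm.y:.2f}, z={lm.z:.2f}")
cap.release()
```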