Stern's (2015) Note Taking Table
Rows: Author One, Author Two, Author Three, Author Four, Author Five.
Columns:
• Type and purpose of study: Type means qualitative, quantitative, or mixed methods research.
• Hypothesis or Research Questions: Both quantitative and qualitative research can have research questions, but only quantitative can have hypotheses.
• Population and Sample
• Methodology: Examples are case study, grounded theory, ethnography, quasi-experimental design, etc.
• Findings: We call it findings in qualitative research and results in quantitative research.
• Evaluation notes: Look for the limitations to the study. Small sample size, not generalizable, bias of researcher, etc. How will the study help your research or why are you rejecting it?

State of Practice for Language and Literacy Research: A Review of Methods

Carla Wood
Autumn McIlraith
Lisa Fitton
Florida State University, Tallahassee
ABSTRACT: Purpose: In an effort to build capacity of future doctoral leaders in speech-language pathology, this review examined journals relevant to language and literacy research for trends in the size of data sets and the use of statistical analyses, randomization, and replication.
Method: A systematic review of empirical studies was conducted. Investigators examined 120 randomly selected scholarly articles published in 2013 and 2014 in 10 journals, including 4 focused on communication disorders and 6 with relevancy to the study of language and literacy.
Results: Based on trends in the randomly selected sample of 120 journal articles, data sets varied largely in terms of size. Random assignment was used 22% of the time. Studies used a wide variety of types of designs and statistical analyses. Larger data sets and statistical methods using multiple levels of analysis were more prevalent in journals that were not specifically within the field of speech-language pathology.
Conclusion: The findings of this systematic review support the need to prepare future scholars to employ rigorous methods and analyses in their research. Potential ways to enhance the infrastructure required for utilization of innovative statistical advances, large-scale data sets, experimental design, and replication are discussed.

KEY WORDS: systematic review, speech-language pathology, reading, literacy

Contemporary Issues in Communication Science and Disorders • Volume 43 • 306–317 • Fall 2016 • 1092-5171/16/4302-0306 • © NSSLHA

The interests of scientists from a variety of disciplines overlap at the focal point of literacy development and intervention. Literacy has been targeted nationwide as a high priority area for external funding to promote improved outcomes in reading, understanding, and subsequent academic progress consistent with the No Child Left Behind Act (NCLB, 2003). This area of research is relevant to multiple fields such as psychology, education, special education, and speech-language pathology.

Given that a variety of disciplines conduct literacy research, a wide distribution of methods used in language and literacy research is to be expected. This diversity of methods is beneficial in that it provides a variety of angles from which to examine current theory and practice, which in turn strengthens our confidence in findings that are consistently replicated across different methods. However, in order to continue to contribute to the body of research, scholars must have a basic understanding of the wide range of methods available in order to
recognize their influence on research findings and
their implications for interpretation. In response to
this need, we reviewed the current state of practice in
language and literacy research in an effort to identify
methodologies that are important for next-generation
scholars who wish to continue to consume and produce research effectively.
Research evidence is considered one of the pillars
of evidence-based practice (EBP) and is critical to
clinical practice (American Speech-Language-Hearing
Association [ASHA], 2005a; McCurtin & Roddam,
2012). EBP in speech-language pathology values both
internal and external evidence, including consulting
the published literature for the best available scientific evidence to support the use or disuse of specific
speech and language intervention approaches. Among
valued skills, speech-language pathologists (SLPs)
are expected to demonstrate knowledge of various research practices and integration of research principles
into EBP (ASHA, 2005b).
Although practitioners and researchers alike
greatly value EBP, there is no general consensus on
what constitutes best available scientific evidence and
which methods and analyses are necessary or appropriate for future consumers and producers of research
in our field. By definition, rigor is used in research
to refer to strict precision; however, the criterion for
acceptable precision may vary across fields. Varied
standards and hierarchies are applied within and
across disciplines (McCurtin & Roddam, 2012), but
the focus of the present article is to examine methodology in language and literacy research from a
perspective of pursuing diversity.
Different methods of research have different
strengths and weaknesses that influence their utility to scholars. However, there is a need to address
the suggestion that research in some fields, such
as speech-language pathology, is not broadening to
include newer advanced methods and statistics (e.g.,
Ioannidis, 2005). Given the national movement toward large data sets, randomization, and large-scale
replication (Ioannidis, 2005), it is critical to examine the use of these methods within and outside of
speech-language pathology research.
Ioannidis (2005), a widely acclaimed researcher,
has called for a revolution in research practices,
with a focus on replication by independent research
teams with large data sets (Ebrahim et al., 2014).
After extensive review of the research literature in
medicine, Ioannidis made a convincing case for the
need for better powered evidence from large studies.
He concluded that small n studies without random
assignment were underpowered and were at risk for
leading practitioners to inaccurate conclusions. Ioannidis exposed a plethora of research findings with
large effect sizes that could not be replicated, resulting in a call for more rigorous research methods and
a focus on replication.
In addition to the use of larger data sets, the
educational research community has increasingly incorporated statistics within the families of hierarchical linear modeling (HLM) and structural equations
modeling (SEM). These techniques have expanded researchers’ abilities to answer more complex research
questions through simultaneously examining interactions between predictor variables and their impacts
on outcome variables. Recent research findings have
suggested that the unique characteristics of individual
participants interact considerably to influence observed results. For example, there have been repeated
calls in bilingual literacy research to report more
background characteristics such as age of first exposure to languages and type of classroom instruction
because these characteristics have been found to be
critically influential (e.g., August & Shanahan, 2006;
Branum-Martin, Tao, & Garnaat, 2015). Although
more traditional statistical models, including analyses
of covariance (ANCOVAs) and multiple regression,
allow for statistical inclusion of these background
characteristics, they are limited in their ability to account for the interactions between variables. Rather,
relationships between included variables are often
evaluated individually in the order dictated by the
researcher. This is appropriate when there is a strong
theoretical basis for the chosen order, but poses a
challenge when an established theoretical foundation
for the research is lacking. In the latter case, it may
be appropriate to consider statistical models such as
HLM and SEM, which offer the benefit of examining relationships between all variables simultaneously
(see Kline, 2015).
An additional advantage of multilevel techniques
such as HLM is that they allow the statistical model
to more accurately represent the structure of the data.
For example, if children are presented with a set of
words and are asked to read them, the individual
responses to each item by each child are not independent. All responses from a single child are dependent
in the sense that they originate from the same child,
and certain child-related features potentially influence
the responses of that child, such as the child’s previous exposure to those words or the child’s phonological awareness skills. All responses to a single word
are similarly dependent, because word-related features
such as regularity and frequency of occurrence may
influence the accuracy of responses across all children. Using HLM, this dependence can be taken into
account by “nesting” the individual responses within
children and within items. Without accounting for
this nesting in HLM or SEM to include both levels
of features in the same model, statistical analysis
may inaccurately estimate effects attributable to each
independent variable, resulting in inaccurate conclusions (Baayen, Davidson, & Bates, 2008). Multilevel
modeling provides an excellent tool for researchers
to use to scale statistical model complexity to match
the structure of more complex data sets, resulting in
more accurate estimates of effects (Compton, Miller,
Gilbert, & Steacy, 2013).
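The cost of ignoring this nesting can be illustrated with a minimal simulation (all variance components here are hypothetical numbers, not values from the article): responses are generated with crossed child-level and item-level effects, and a standard error of the mean computed as if all responses were independent comes out far smaller than one that respects the child clusters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_children, n_items = 30, 20

# hypothetical variance components: child ability, item difficulty, residual noise
child_fx = rng.normal(0.0, 1.0, n_children)
item_fx = rng.normal(0.0, 0.5, n_items)
noise = rng.normal(0.0, 0.5, (n_children, n_items))

# each child responds to every item, so responses are crossed by child and item
scores = 2.0 + child_fx[:, None] + item_fx[None, :] + noise

# naive analysis: treat all 600 responses as independent observations
y = scores.ravel()
naive_se = y.std(ddof=1) / np.sqrt(y.size)

# cluster-aware analysis: average within each child, then take the SE across children
child_means = scores.mean(axis=1)
clustered_se = child_means.std(ddof=1) / np.sqrt(n_children)

print(f"naive SE = {naive_se:.3f}, child-clustered SE = {clustered_se:.3f}")
```

Because the child-level variance dominates in this sketch, the naive standard error understates uncertainty several-fold; multilevel models such as HLM build this correction, and the analogous one for items, into the model itself.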
Modeling Revolution

The rise in the use of multilevel modeling in many academic fields has led some researchers to refer to a "modeling revolution" (Rodgers, 2010). The term modeling is used to refer to "a set of assumptions together with implications drawn from them by mathematical reasoning" (Neimark & Estes, 1967, p. v). Evidence for an increase in the use of multilevel modeling was demonstrated by Reinhart, Haring, Levin, Patall, and Robinson (2013), who examined methods used in 275 empirical articles from five primary research journals that were published between 2000 and 2010. They found that the use of modeling increased from 15% of empirical studies in 2000 (nine out of 61) to 54% in 2010 (50 out of 93).

Challenges of Advanced Modeling and Methodology

Despite the development of innovative statistical techniques and Ioannidis's (2005) call for larger sample sizes, exclusive reliance on traditional statistical practices persists among many scholars. Considering the abundant support in place for traditional systems and the lack of support and resources available for novel procedures, the adoption of new practices is difficult. In the field of speech-language pathology in particular, there are several notable challenges restricting the incorporation of newer and more complex methodologies into our research practice. A few barriers that will be discussed throughout the article include (a) the prevalence of small n studies due to the low incidence of disabilities of interest, (b) challenges to random assignment due to the inherent participant characteristic of presenting with a disability or not (e.g., researchers cannot randomly assign cochlear implants), and (c) ethical conflicts of random assignment to a treatment versus comparison control group when early intervention is warranted.

The prevalence of small n studies has been perpetuated by several factors. In speech-language pathology, the tendency toward small data sets may be partially attributed to our interest in low-incidence populations. Recruiting children and youth, particularly from low-incidence populations, presents challenges. Further, within the populations we research, it is difficult to employ true experimental designs with random assignment given that participant characteristics cannot be assigned and treatment often cannot be ethically withheld. Low-incidence populations of interest can also have high degrees of heterogeneity. This is exemplified in the area of aural rehabilitation for individuals who are deaf or hard of hearing. Among other variables, participants in such research commonly present with differences in age of onset, age of identification, underlying cause, severity, use of sensory device, comorbidity, communication method, and educational settings or approaches. Such an array of complex characteristics, coupled with the low incidence of deafness, presents methodological challenges for research in aural rehabilitation techniques, and many other specializations face similar complexities.

Perhaps equally as powerful as these methodological challenges is the current lack of infrastructure that would be required to support and sustain change or a revolution in methodology. Sharpe (2013) identified several infrastructure components that are necessary to support the inclusion of more complex methods in research practices, including (a) better access to continuing education in advanced methods and statistics, and (b) better use of mavens in the field. With regard to continuing education, SLPs and researchers in our field readily access continuing education in content across big areas (e.g., language, speech, hearing, fluency, voice, social communication, communication modalities, and cognition). However, continuing education regarding methodological or statistical innovations for consumers or producers of research is less available. One might argue that this level of continuing education is not feasible given that our field has multiple areas of content specialty to master. Sharpe also raised the notion of statistical mavens, or individuals who serve as liaisons between statistical innovators and content-area experts. The presence of mavens could serve to provide practitioners and researchers with the information and new skills needed to be effective consumers and producers of advanced methods research. Although a promising concept, using mavens to enhance research design and analyses has received limited attention in the field of speech-language pathology.

Without access to large data sets or infrastructure for continuing education on innovative research methodologies, research in speech-language pathology may be at risk for being "left behind" as new methods and statistical practices are developed and adopted by other fields in order to support more robust, generalizable research. Our resistance to including innovative techniques in our statistical toolboxes could come at great cost as external funding becomes increasingly
competitive and as related fields continue to publish
research that was conducted using these complex
methods. Researchers with access to large data sets
and those who can leverage the most appropriate
designs and employ random assignment have the upper hand in competing for external funding. When we
consider the disorders that are studied within speech-language pathology, the heterogeneity of affected
individuals, and the number of variables we know to
be important, it is critical that we expand our methodological skills to include approaches that recognize
and account for these levels of complexity.
Given the importance of rigor and of methodological diversity in research, we decided to examine
the literature in language and literacy-related journals
in order to foster discussion of the range and distribution of research methodologies that are currently
being used. We decided to examine and describe
general characteristics (e.g., number of participants,
number of research institutions involved, population
studied, methods, statistical analyses, and funding) of
2013–2014 publications in 10 journals related to child
language and literacy based on a review of a subset
of randomly selected articles. Specifically, we asked
the following research questions:
• What were the average sample sizes and general
characteristics of studies published in 2013–2014
in 10 journals that are recognized as publishing language and literacy research (based on a
randomly selected sample of 120 articles)?
• What was the proportion of different types of
analyses (e.g., qualitative; single-case; nonparametric; ordinary least squares [OLS] or linear
least squares; and advanced statistics, such as
SEM, HLM, item response theory) in 2013–2014
publications in 10 journals related to child language and literacy (based on a randomly selected
sample of 120 articles)?
• To what extent was random assignment and
replication used in publications in the sample of
2013–2014 articles in 10 journals?
Finally, a secondary interest of the current research (although not a specific research question) was
to discuss the possible implications of these findings
for current practices for scholars in higher education
programs in communication disorders and specifically
for those with interests in language and literacy.
Method
Journals of Interest
We selected 10 journals to review, including four
journals of speech-language pathology: Language,
Speech, and Hearing Services in Schools (LSHSS);
American Journal of Speech-Language Pathology
(AJSLP); Journal of Speech, Language, and Hearing Research (JSLHR); and Journal of Communication Disorders. Six additional journals relevant to
language and literacy were selected for inclusion due to their relevance to the aim of exploring journals that publish language and literacy research: Journal of Educational Psychology (JEP), Scientific Study of Reading (SSR), Reading Research Quarterly (RRQ), Reading and Writing, Journal of Learning Disabilities, and Journal of Research on Educational Effectiveness (JREE).
LSHSS. LSHSS is a quarterly journal that is
produced by ASHA. The journal focuses on research
that is relevant to SLPs and audiologists in schools.
The journal is designed to address the needs of researchers, clinicians, and students who are interested
in school-based issues. In 2012, LSHSS reported an
impact factor of 1.256 and a 5-year impact factor
of 1.520. In 2013, the journal published four issues containing a total of 26 research articles, and in 2014, 17 research articles. These reported research on typically developing children and children with communication disorders ranging in age from toddlers to high school students, as well as adults who serve as SLPs.
AJSLP. AJSLP is a quarterly journal that is also
produced by ASHA. It is designed to disseminate
research findings applicable to clinical practice in
speech-language pathology. In 2012, AJSLP reported
an impact factor of 2.448 and a 5-year impact factor
of 2.897. Although it overlaps with LSHSS on school-based topics, AJSLP publishes on all aspects of clinical practice. In
2013, the journal published four issues containing 27
research articles, and in 2014, published four issues
containing 50 research articles.
JSLHR. JSLHR is a bimonthly journal that is
also produced by ASHA. It is designed to disseminate
basic and applied research that focuses on normal and
disordered communication processes. The mission of
JSLHR focuses on advancing evidence-based practices
as well as providing new information and theoretical
approaches that are relevant to speech, language, and
hearing processes, assessment, and management. In
2012, JSLHR reported an impact factor of 1.971 and
a 5-year impact factor of 2.745. In 2013, the journal
published six issues containing 149 articles, and in
2014, published six issues containing 171 articles.
Journal of Communication Disorders. The Journal of Communication Disorders, published six times
a year, disseminates articles with a focus on disorders
of speech, language, and hearing. Although the journal does not exclusively publish language and literacy
research, the assessment, diagnosis, and treatment of
reading disorders is a focus of the journal in addition
to the reading-related implications of other communication disorders. The Journal of Communication
Disorders reported an impact factor of 1.278 in 2015
and a 5-year impact factor of 1.864. In 2013, this
journal published one volume (46) with six issues
containing a total of 38 research articles; in 2014, the
journal published six volumes (47–52) containing a
total of 40 articles.
Journal of Learning Disabilities. The Journal of
Learning Disabilities, a journal of the Hammill Institute on Disabilities by Sage, publishes six issues per
year, with articles on the science of learning disabilities. The journal reports an impact factor of 1.901
and a ranking of 4 out of 39 in special education.
In 2013, this journal produced one volume (46) with
six issues containing 44 research articles. In 2014,
the journal produced one volume (47) with six issues
containing 43 research articles.
JREE. JREE is a quarterly publication of the
Society for Research on Educational Effectiveness.
Among the journal’s aims are to disseminate findings
of intervention and evaluation studies or methodological studies that focus on the process and implementation of educational research, specifically related to
problems in school classrooms. It was reported to
have an impact factor of 3.154. In 2013, the journal
published 15 research articles. In 2014, the journal published four issues with 16 research articles,
including a special issue (number 3) on learning
disabilities research studies reporting findings from
projects funded by the National Institute of Child
Health and Human Development. The introduction to
a special issue and two commentaries were excluded
from review.
JEP. JEP is a quarterly journal designed to
disseminate research pertaining to education across
the lifespan, from early childhood to geriatrics. Not
exclusively focused on language and literacy research, the journal includes general research in the
area of educational psychology. JEP identifies several
key focus areas, including scholarship on learning,
cognition, instruction, motivation, social issues, emotion, development, special populations, and individual
differences. In 2012, JEP reported an impact factor
of 3.08 and a 5-year impact factor of 4.93. In 2013,
the journal published 81 research articles; in 2014, it
published 79 research articles.
SSR. SSR is a bimonthly journal that is produced
by the Society for the Scientific Study of Reading. It
focuses on empirical studies related to language and
literacy, although it also accepts literature reviews,
papers on theory and constructs, and policy papers.
The journal indicates that it places value on both
theoretical and practical significance. SSR reports a
5-year impact factor of 3.124. In 2013, the journal
published 26 research articles; in 2014, it published
21 research articles.
RRQ. RRQ is a quarterly journal that is produced by the International Reading Association. It is
designed to facilitate connections between researchers in an effort to build a knowledge base in reading
and literacy. The journal identifies empirical studies;
multidisciplinary research; various modes of investigation; and diverse perspectives on teaching, practices, and learning as valued areas. RRQ reports an
impact factor of 2.382. In 2013, the journal published
21 research articles; in 2014, it published 22 research
articles.
Reading and Writing: An Interdisciplinary
Journal. Reading and Writing is a quarterly journal
published by Springer with a primary aim of disseminating scientific articles related to the process,
acquisition, and loss of reading and writing skills.
The journal description highlights that the focus of
the journal spans several disciplines including neuropsychology, cognitive psychology, speech and hearing science, and education. Based on data from 2001
to 2005, the impact factor is reported to be 3.85. In
2013, this journal published nine issues containing
a total of 69 articles. In 2014, the journal published
nine issues containing 80 articles, 79 of which were research articles and one of which was an introduction to a special issue.
Procedure
In approaching the task of identifying and describing research designs and analyses used in each study,
we first had to establish definitions of relevant terms
across disciplines. For this review, we included only
research studies and excluded general literature reviews, editorials, commentaries, tutorials, and position papers. We categorized studies by design using
broad categories (refer to the Appendix for definitions and categorizations). Additionally, we identified
basic characteristics such as number of participants,
age range, number of research institutions involved,
method(s) of analyses, presence of random assignment, and inclusion of replication.
We pulled a random sample of 12 articles from
each journal, six from each of 2 years—2013 and
2014. Although six is a notably small proportion of
articles for some of the journals, the quantity of six
was selected in part because a few of the journals
had a relatively small corpus of articles published in
a given year (e.g., 17 articles in total for LSHSS and
15 for JREE in 2013). Albeit an arbitrary number, 12
was equivalent to bimonthly distribution. To complete
this task, we entered each research article in a journal
on a line in an Excel database, organized by year and
journal title. Initially, we considered excluding articles
that did not explicitly discuss literacy implications;
however, upon further consideration, it was apparent that a case could be made that literacy components and implications of disorders are quite vast. For
example, an article on phonetic processing during the
acquisition of new words by children with cochlear
implants may not explicitly discuss literacy, yet the
research has implicit relevance to phonological awareness. As a result, we used an inclusive approach such
that every research article in the relevant journals had
an equal chance of being selected for review.
We entered the range of numbers associated with the rows of the Excel database into a random number generator (e.g., select six numbers between 1 and 132). The random numbers generated served to identify which articles would be reviewed for the journal.
Six articles were selected in this manner for each
year of the journal. In the event that the selected
article did not qualify for the study (e.g., a literature review or editorial without data), the article was
excluded and the random number generator was used
to derive a new line number corresponding to another
article. This occurred in three instances for JREE;
twice for RRQ; and once for AJSLP, JLD, Journal of
Communication Disorders, and SSR.
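The draw-and-redraw procedure described above can be sketched as follows. This is a hypothetical re-implementation: the function name, the `empirical` flag, and the seed are illustrative, not taken from the article.

```python
import random

def select_articles(rows, n=6, seed=1):
    """Randomly draw n qualifying row numbers without replacement,
    redrawing whenever a pick is a non-qualifying entry
    (e.g., a literature review or editorial without data)."""
    rng = random.Random(seed)
    pool = list(range(len(rows)))                 # row numbers in the spreadsheet
    chosen = []
    while len(chosen) < n and pool:
        idx = pool.pop(rng.randrange(len(pool)))  # draw without replacement
        if rows[idx].get("empirical", False):     # does the article qualify?
            chosen.append(idx)
    return chosen

# a fake journal year: every seventh entry is a non-qualifying editorial
articles = [{"empirical": i % 7 != 0} for i in range(40)]
picked = select_articles(articles, n=6)
print(picked)
```

Drawing without replacement and simply redrawing on exclusions preserves the key property the authors wanted: every qualifying research article has an equal chance of selection.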
Each author independently coded four articles
from each of the first six journals in an Excel database using mutually agreed-on criteria. A month into
the process, we met to discuss the parameters for
categorization and to adjust and refine the definitions
for distinction of categories. The rules of coding were
further defined to clarify the process based on our
initial reviews. We agreed that when a scholarly article contained multiple studies or experiments, the study with the most advanced statistical design would be included; if the studies had the same design, which was the case in all instances, the study with the largest number of participants would be included in the review. Of
the articles reviewed, nine articles included multiple
research studies within the scholarly article.
Categorizing statistical analyses. We coded each article into one of four categories based on the type of research analysis it included: qualitative research, single-case methods or a case study, traditional estimation, or advanced statistics. Qualitative studies included research that employed qualitative analyses, such as open-ended interviewing, to describe and develop themes. The single-case code was assigned to studies that employed single-case design methods (e.g., A-B-A withdrawal, multiple baseline, or alternating treatments; Horner et al., 2005) or reported a case study. Traditional estimation included single-level analyses such as OLS approaches (e.g., analysis of variance [ANOVA], regression, t tests), which had single variance terms, including correlations, descriptive statistics, and nonparametric analyses. Advanced
statistics included methods that employed multiple
levels of analysis (e.g., SEM or HLM) or considered
multiple dimensions of sampling (e.g., subjects and
time or subjects and items) such as growth curves
and mixed models (Garson, 2013).
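As a sketch, the coding scheme can be expressed as a lookup from named analyses to category codes, combined with the authors' "take the most advanced design" rule for multi-study articles. The specific analysis names below are illustrative examples, not the authors' full coding manual.

```python
# illustrative mapping from named analyses to the four category codes
CATEGORY_BY_ANALYSIS = {
    "open-ended interview": "qualitative",
    "thematic analysis": "qualitative",
    "A-B-A withdrawal": "single-case",
    "multiple baseline": "single-case",
    "ANOVA": "traditional",
    "regression": "traditional",
    "t test": "traditional",
    "correlation": "traditional",
    "SEM": "advanced",
    "HLM": "advanced",
    "growth curve": "advanced",
}

# ordering from least to most statistically advanced
RANK = ["qualitative", "single-case", "traditional", "advanced"]

def code_article(analyses):
    """Assign an article the most advanced category among its analyses."""
    categories = [CATEGORY_BY_ANALYSIS[a] for a in analyses]
    return max(categories, key=RANK.index)

print(code_article(["ANOVA", "HLM"]))  # an article using both is coded "advanced"
```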
During the review process, we found that several of the randomly selected studies used a meta-analysis. As a result, meta-analysis was added as an
additional category. In addition to the assignment of
categorical types, we also identified specific analyses
used by name (e.g., hierarchical regression) to further
describe the statistical methodology used in a study.
Randomization was coded with a binary code for
the presence or absence of random sampling. This
category was included because of the recent focus on
randomization (Ioannidis, 2005); however, we recognized that the use of randomization may be less
feasible or practical for studies examining individuals
with communication disorders. Notably, randomization
is not always possible in clinically relevant research
in speech-language pathology. For example, researchers cannot randomize who has Down syndrome and
who is typically developing, or which participants receive a cochlear implant at an early age versus those
receiving treatment B (e.g., hearing aids).
Agreement. Of the articles reviewed, 14 were
randomly selected to assess interrater agreement.
Agreement was calculated by dividing the number of agreements by the total (agreements plus disagreements) and multiplying by 100. There was 100% agreement for the
type of design (e.g., qualitative, single-case design/
case study, traditional, or advanced) across coders for
the six journals. For other data elements (e.g., number of participants, random assignment), there were
three cases of confusion where one aspect of interest (e.g., replication) was not explicitly stated in the
article. All instances of disagreement were discussed
until we reached consensus.
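The agreement formula above reduces to a one-line computation, sketched here as a minimal helper:

```python
def percent_agreement(agreements, disagreements):
    """Interrater agreement as a percentage of all coded comparisons:
    agreements / (agreements + disagreements) * 100."""
    total = agreements + disagreements
    return 100.0 * agreements / total

# e.g., 14 design codes with no disagreements gives perfect agreement
print(percent_agreement(14, 0))  # 100.0
```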
Results
Following completion of the review process and resolution of coding disagreements, we conducted descriptive
data analyses to identify trends in designs and analyses
in order to address the research aims, which included
describing (a) the average sample size and general
characteristics, (b) the proportion of different types of
statistical analyses used, and (c) the extent to which
random assignment and replication was used.
General Characteristics
The population of interest and the characteristics of the data set also varied by population type
(Table 2). The articles in the journals either focused
on typical populations exclusively (60%), focused on
participants with atypical development or disorders
(20%), or included both groups of participants with
typical and atypical characteristics (20%). The studies
largely reported on children in grades K–12 (57%),
but the age group of focus was somewhat distributed:
infant-toddler (9%), preschool (6%), college attendees (9%), and adults (17%). A large portion of the
studies reported that the research was grant funded
(65%), although some authors may not have reported
grant funding as it did not appear to be a customary
practice for all of the journals
To address the first research aim, we aggregated data
across the 12 articles selected from each journal and
across the complete set of 120 articles. On average,
the studies in the articles showed wide variability in
the sample size, with a mean of 973 (SD = 4282).
The extreme ends of the range were typically large
n survey studies (n = 5) and single-case studies (n =
5). When the five survey studies were excluded from
the average number of participants, the adjusted mean
sample size was 699 (SD = 3,313).
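The effect of excluding the large-n survey studies can be illustrated with a brief sketch; the sample sizes below are invented for illustration (not the reviewed data sets) and simply show how a single very large survey inflates both the mean and the standard deviation:

```python
# How a single large-n survey study inflates the mean and SD of sample sizes.
# The values are hypothetical, not the data sets reviewed in this article.
from statistics import mean, pstdev

sample_sizes = [12, 30, 45, 60, 80, 15000]  # last value: a large-n survey study
without_surveys = sample_sizes[:-1]         # survey study excluded

print(round(mean(sample_sizes)), round(pstdev(sample_sizes)))
print(round(mean(without_surveys)), round(pstdev(without_surveys)))
```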
On average, the sampled studies were conducted by two research institutions, with a range of one to five. The large standard deviations are partially explained by the fact that the average sample size and number of research institutions varied across journals.
Table 1 reflects detailed descriptive data by journal.
The size of the participant pool tended to be smaller
for journals that focused on populations with communication disorders (e.g., LSHSS, AJSLP, JSLHR, JCD)
where the average number of participants ranged
from 27 to 108. Journals that tended to include
populations with typical development (e.g., JEP, SSR,
RRQ, JREE) tended to have a larger average number of participants, with means ranging from 140 to
6,241. Three journals, Journal of Learning Disabilities, Reading and Writing, and JREE, appeared to be
exceptions to that trend. These three journals included
populations with learning disabilities and reading
disorders but also had large mean sample sizes (215,
139, and 2,765, respectively).
Types of Analyses
The 120 articles reviewed used six types of design,
including qualitative (n = 4), single-case design or
case study (n = 7), traditional quantitative (n = 69),
advanced statistics (n = 37), meta-analysis (n = 2),
and one simulation that was not specified in our
original coding scheme. The prevalence of advanced
statistics varied considerably across journals and
particularly between journals that focused on communication disorders and other journals (refer to Table
3). Based on the descriptive data in Table 3, six of
the 10 journals employed advanced statistics on 25%
or more of the randomly selected articles (RRQ, RW,
JLD, JREE, SSR, and JEP). Journals that focused
more exclusively on participants with communication disorders (LSHSS, AJSLP, JSLHR, JCD) demonstrated a lower proportion of advanced statistics but employed a variety of designs (i.e., qualitative, single case, traditional, and advanced). The majority of the sampled articles employed traditional methods (58%–92% of the time), most commonly reporting descriptive data, ANOVAs, or regression analyses.

Table 1. Summary of data set size based on 120 randomly selected articles reviewed.

Journal   Universities, M (SD)   Participants, M (SD)   More than 50 participantsa   Range of sample size   Participants, surveys excludedb, M (SD)
LSHSS     1.70 (0.89)            108 (147)              50%                          1–461                  108 (147)
AJSLP     1.60 (0.67)            27 (17)                0%                           3–48                   27 (17)
JSLHR     1.75 (0.75)            54 (40)                33%                          8–165                  54 (40)
JCD       2.20 (1.29)            61 (76)                33%                          4–250                  61 (76)
JLD       2.20 (1.21)            215 (292)              92%                          29–1,031               215 (292)
JREE      2.60 (1.38)            2,765 (4,584)          100%                         114–13,803             2,765 (4,584)
JEP       2.50 (1.31)            6,241 (11,746)         92%                          47–31,038              4,151 (9,684)
SSR       1.80 (0.72)            140 (122)              83%                          40–466                 140 (122)
RRQ       1.70 (0.78)            276 (494)              67%                          1–649                  143 (187)
RW        2.08 (1.16)            139 (101)              83%                          28–386                 139 (101)

Note. LSHSS = Language, Speech, and Hearing Services in Schools; AJSLP = American Journal of Speech-Language Pathology; JSLHR = Journal of Speech, Language, and Hearing Research; JCD = Journal of Communication Disorders; JLD = Journal of Learning Disabilities; JREE = Journal of Research on Educational Effectiveness; JEP = Journal of Educational Psychology; SSR = Scientific Study of Reading; RRQ = Reading Research Quarterly; RW = Reading and Writing.
aRefers to the percentage of studies reporting more than 50 participants; brepresents the mean number of participants without the studies that involved only surveyed participants.

Table 2. Summary of participant characteristics in the 120 articles.

Journal   Typical populationsa   Infant-toddlerb   Preschoolc   K–12   College age   Adults
LSHSS     58%                    18%               9%           73%    0%            0%
AJSLP     25%                    33%               8%           8%     0%            50%
JSLHR     42%                    8%                8%           33%    17%           33%
JCD       25%                    17%               17%          42%    0%            25%
JLD       16%                    0%                0%           83%    17%           0%
JEP       92%                    8%                0%           33%    25%           33%
SSR       100%                   0%                17%          83%    0%            0%
RRQ       100%                   0%                0%           83%    8%            8%
JREEd     83%                    8%                0%           75%    8%            0%
RW        75%                    0%                0%           67%    17%           17%

aThe percentage of articles that included participants who were typically developing; bthe percentage of articles that included participants who were under 3 years of age; cthe percentage of articles that included participants who were 3–5 years of age; dJREE does not equal 100% because one study did not include human subject participants.

Table 3. Proportion of analyses used in 120 randomly selected articles from 2013 to 2014.

Journal   Qualitative, n (%)   Single case or case study, n (%)   Traditional, n (%)   Advanced statistics, n (%)
LSHSS     0 (0%)               2 (17%)                            8 (67%)              2 (17%)
AJSLP     2 (17%)              2 (17%)                            7 (58%)              1 (8%)
JSLHR     0 (0%)               0 (0%)                             11 (92%)             1 (8%)
JCD       2 (17%)              0 (0%)                             9 (75%)              1 (8%)
JLD       0 (0%)               0 (0%)                             7 (58%)              5 (42%)
JEP       0 (0%)               0 (0%)                             5 (42%)              7 (58%)
SSR       0 (0%)               0 (0%)                             9 (75%)              3 (25%)
RRQ       1 (8%)               1 (8%)                             5 (42%)              5 (42%)
JREEa     0 (0%)               0 (0%)                             0 (0%)               9 (75%)
RW        1 (8%)               0 (0%)                             8 (67%)              3 (25%)

aJREE does not equal 100%. The other three randomly selected articles represented two types not captured in the above categories: two (17%) were meta-analyses and one (8%) was a simulation with a methodological question.

Contemporary Issues in Communication Science and Disorders • Volume 43 • 306–317 • Fall 2016
Randomization and Replication

Random assignment was not a predominating characteristic of the research studies reviewed, with only 22% using random assignment. Most of the studies did not report that the research was an attempt to replicate a previous study or did not include replication in their methods (92%). Of the studies in the review pool that addressed replication (n = 10), nine built replication into their design in order to replicate their own findings in a sequence of studies, and one reported that a primary aim was to replicate another existing study.

Discussion

Key Findings
Based on trends in the randomly selected sample of
120 journal articles, multidimensional methods of
analysis were commonly used, particularly in articles
published in JREE, JLD, JEP, RW, RRQ, and SSR.
The common use of multidimensional methods seen
here is consistent with trends reported in the literature, as noted by one methodologist: “Multilevel and
hierarchical modeling through various types of linear
mixed models has rapidly become a required asset in
the statistical toolkit of researchers worldwide” (Garson, 2013, p. 23). Trends in the current review suggest
that the data sets used in the studies varied widely in size, with 62% including 50 or more participants. Articles from journals in speech-language
pathology showed lower average sample sizes.
Across all journals, random assignment was the
exception rather than the norm. Studies used a wide
variety of types of design and statistical method.
Advanced statistical methods that considered multiple
levels of analysis and dimensions of sampling were
more prevalent in journals that were not specifically
within the field of speech-language pathology (e.g.,
JREE, JEP, RRQ, JLD, RW, and SSR).
It is not surprising that studies from selected
journals in speech-language pathology showed smaller
sample sizes on average. The focus on low-incidence populations in speech-language pathology may
partially explain the tendency toward small data sets.
Consistent with this explanation, the out-of-field journals showed a higher percentage of articles pertaining
to typical populations. Recruiting children and youth,
particularly from low-incidence populations, may
require innovative collaborations across institutions
in order to access larger data sets. Further, the nature
of our interest in populations with unique characteristics may impact the types of methods used, in that
employing true experimental designs with random
assignment may be challenging given that participant
characteristics cannot be assigned and treatment often
cannot be ethically withheld.
Mechanisms of Change

One of the intentions of this article was to generate discussion of ways to enhance leadership training in order to prepare next-generation scholars to ensure rigor in research in our field. The results illuminate trends, similarities, and differences in the methods used in scholarly journals related to language and literacy. The discrepancies between journals highlight the need to ensure that scholars in speech-language pathology are poised to be consumers and producers of research who employ diverse methods in order to be competitive for external funding.

Although the identification of levers of change is beyond the scope of this article, one of the aims was to generate such discussion. There are many possible mechanisms of change, among them professional development, combining forces to establish large data sets, and Big Data at a multistate level. Although professional development is widely available in communication disorders programs, it is perhaps less commonly focused on furthering research competencies for consumers or producers of evidence-based practices. Further, there are not many examples of large open-access or multistate data sets available in the extant literature; however, the concept is aligned with Heilmann, Miller, and Nockerts' (2010) description of the establishment of large language-sample databases. In this example, the establishment of multistate data using common language-sample protocols allowed for replication and ultimately a discriminant function analysis to validate the use of language sample measures in classifying children's language status.

Other potential change mechanisms noted in the literature include (a) fostering the establishment of mavens with content knowledge in our field, (b) providing social support such as setting up a special interest group on advanced statistics and methodology, and/or (c) creating blogs on statistical design (Sharpe, 2013). Leaders in the field of speech-language pathology have noted the critical need for building partnerships between highly qualified researchers and school-based SLPs to conduct well-designed, high-quality intervention studies focused on improving children's language and literacy needs (Nippold, 2015).

Becoming producers or consumers of rigorous research using more complex statistics and methodology is difficult, if not impossible, without access to comprehensible professional development. It would not be surprising to find that well-intentioned lifelong learners have purchased an advanced methods book hoping for professional guidance, only to find that they cannot easily digest it without a guide or interpreter. In response, increased offerings of short courses or research translation sessions may be considered that focus on innovative statistics and methodology with relevant application to our field and practices.

There may be no immediate or simple solutions that emerge from the literature. In brainstorming possible options to overcome challenges of working with low-incidence populations, it would seem important for us to leverage incentives to build a useful body of evidence. National organizations could offer incentives for individual clinicians to register item responses on commonly administered assessments in order to facilitate the formation of national data sets that might have utility for difficult-to-answer research questions. It may also be beneficial for national organizations to have designated mavens or information brokers who are skilled at bundling evidence into meaningful, manageable information packages to facilitate use in clinical practices, as findings of rigorous research may otherwise go ignored without translation (Mullen, 2005).
Although the role of SLPs in language and
literacy research may be highly regarded, our field
risks being left behind if there are lags in the time it
takes us to incorporate novel statistics and methodology into our research repertoire. Some authors have
suggested that preparation for scientific rigor begins
at the undergraduate curriculum so as to optimally
prepare students for scientific careers in our field
(Koehnke, McNeil, Chapman, Folsom, & Nunez,
2014). Research experiences and course work in
methodology and statistics should begin early in students’ career paths but also be present as a constant
focus throughout training and beyond.
Each decade also brings waves of innovations in
research practices. Structural equation modeling (SEM), hierarchical linear modeling (HLM), and advances in statistical software bring new opportunities for rigorous
research designs. In response, continuing education
opportunities are needed to help CSD faculty compete
in the research-funding climate. Within a relatively
short time of obtaining a degree or even a terminal
degree comes the sweeping realization that nothing
is terminal about one’s understanding of research
design, methodology, and statistics. Indeed, whether
one’s desire is to produce high-quality research or to
be a wise consumer of it, knowledge of best practices
in research can quickly become archaic without continuing education opportunities to stay current.
The end goal in advocating for reliable evidence
is to better inform policy and practice. With their
sails set on random assignment, massive data sets,
and replication, researchers in educational and CSD
research may benefit from strategic preparation to
poise themselves to take full advantage of the new
options to promote rigorous research.
Study Limitations
It cannot be assumed that the randomly selected
sample of journal articles is representative of the
collective set of articles in each journal or representative of the typical proportion of articles using each
type of design. We may have derived different trends
if we had characterized every article in each issue of
each journal. Also, it should be noted that the current
review is not an exhaustive review of all journals
pertaining to language and literacy. Notably, there
are numerous other journals that could be included to
expand the review; however, we felt these journals to
be of interest as flagship journals.
Implications
Despite limitations, the trends in the findings of the
review support the need to prepare future faculty
and scholars in speech-language pathology to be
proficient consumers and producers of a variety of
research designs and statistical methodology. Doctoral
programs in speech-language pathology may want
to consider carefully which research-related competencies to include, spanning qualitative methods, single-case designs, and multilevel models. Based on the trends, particularly in JEP, JREE, RRQ, and SSR, there is
regular use of SEM and hierarchical methods. The
current review suggests that studies published in
ASHA journals may be shifting somewhat in their use of
advanced methods as well, as noted by the use of
multilevel models in some of the articles reviewed. It
may be difficult to consider intensive systems change
in preparing doctoral students for use of a range of
methods when doctoral students in CSD programs are
already in short supply. In response, mechanisms of
change may need to be considered.
References
American Speech-Language-Hearing Association. (2005a).
Evidence-based practice in communication disorders:
Position statement and technical report. Retrieved from
www.asha.org/policy doi:10.1044/policy
American Speech-Language-Hearing Association.
(2005b). Shortages in special education and U. S. Office
of Related Services focus on new coalition—Shortages
outstrip those in math, science. Rockville, MD: Author.
August, D., & Shanahan, T. (Eds.). (2006). Developing
literacy in second-language learners: Report of the National Literacy Panel on Language-Minority Children and
Youth. Mahwah, NJ: Erlbaum.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008).
Mixed-effects modeling with crossed random effects for
subjects and items. Journal of Memory and Language,
59, 390–412.
Branum-Martin, L., Tao, S., & Garnaat, S. (2015). Bilingual phonological awareness: Reexamining the evidence
for relations within and across languages. Journal of
Educational Psychology, 107(1), 111–125.
Compton, D. L, Miller, A. C., Gilbert, J. K., & Steacy,
L. M. (2013). What can be learned about the reading comprehension of poor readers through the use of
advanced statistical modeling techniques? In B. Miller,
L. E. Cutting, & P. McCardle (Eds.), Unraveling reading
comprehension: Behavioral, neurobiological, and genetic
components (pp 135–147). Baltimore, MD: Brookes.
Ebrahim, S., Sohani, Z., Montoya, L., Agarwal, A.,
Thorlund, K., Mills, E. J., & Ioannidis, J. P. A.
(2014). Reanalyses of randomized clinical trial data.
Journal of the American Medical Association, 312(10),
1024–1032. doi:10.1001/jama.2014.9646
Garson, G. D. (2013). Fundamentals of hierarchical linear
and multilevel modeling. In D. Garson (Ed.), Hierarchical linear modeling: Guide and applications (pp. 3–25).
Thousand Oaks, CA: Sage.
Wood et al.: Language and Literacy Research
315
Heilmann, J. J., Miller, J. F., & Nockerts, A. (2010).
Large language sample databases. Language, Speech, and
Hearing Services in Schools, 41, 84–95.
Neimark, E. D., & Estes, W. K. (1967). Stimulus sampling
theory. San Francisco, CA: Holdenday.
Nippold, M. (2015). Call for studies in implementation
science: Improving reading comprehension in school-age
children. Language, Speech, and Hearing Services in
Schools, 46, 65–67. doi:1044/2015_LSHSS-15-0010
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom,
S., & Wolery, M. (2005). The use of single-subject
research to identify evidence-based practice in special
education. Exceptional Children, 71(2) 165–179. doi:10.1
177/001440290507100203
No Child Left Behind (NCLB) Act of 2001, 10 U. S. C. A.
6301 et seq. (West 2003).
Ioannidis, J. (2005). Why most published research findings
are false. PLoS Medicine, 2(8), 696–701. doi:10.1371/
journal.pmed.0020124
Reinhart, A. L., Haring, S. H., Levin, J. R., Patall, E.
A., & Robinson, D. H. (2013). Models of not-so-good
behavior: Yet another way to squeeze causality and
recommendations for practice out of correlational data.
Journal of Educational Psychology, 105, 241–247.
Kline, R. B. (2015). Principles and practice of structural
equation modeling (4th ed.) New York, NY: Guilford
Press.
Rodgers, J. L. (2010). The epistemology of mathematical
and statistical modeling: A quiet methodological revolution. American Psychologist, 65, 1–12. doi:10.1037/
a0018326
Koehnke, J., McNeil, M. Chapman, K., Folsom, R. C.,
& Nunez, L. (2014, April). Addressing the PhD shortage. Paper presented at the conference of the Council
on Academic Programs in Communication Sciences and
Disorders. Retrieved from www.capcsd.org/conference/
2014Handouts/Addressing_the_PhD_Shortage_April11_
2014.pdf
Sharpe, D. (2013). Why the resistance to statistical innovations? Bridging the communication gap. Psychological
Methods, 18(4), 572–582. doi:10.1037/a0034177
McCurtin, A., & Roddam, H. (2012). Evidence-based
practice: SLTs under siege or opportunity for growth?
The use and nature of research evidence in the profession. International Journal of Communication Disorders,
47(1), 11–26. doi:10.1111/j.14606984.2011.00074.x
Contact author: Carla Wood, Florida State University,
Communication Disorders, 201 W Bloxham, Tallahassee, FL
32309. Email: carla.wood@cci.fsu.edu
Mullen, R. (2005, November 8). Survey tests members’
understanding of evidence-based practice. The ASHA
Leader, 10, pp. 4–14. doi:10.1044:/leader.AN.10152005.4
Appendix. Description of Constructs for Coding
Advanced was used to describe quantitative studies that
utilized more complex statistical analyses including path
analysis, item response theory, and approaches that used
multiple dimensions of sampling, such as structural equation modeling and multilevel modeling, and meta-analyses.
Quantitative was used to describe studies that employed
any form of quantitative methodology, using probability statistics to make inferences or draw conclusions. Studies that
employed only surveys for data collection were excluded
from this category.
Atypical sample was defined as a sample selected to target
specifically members of an atypical population. Samples
that were not specified to target specifically atypical populations were labeled as typical samples.
Random assignment was defined as statistically random assignment of groups or manipulations.
Design was used to describe the methodological procedures
employed to address the purpose of the article and was categorically labeled. Categories included single group: single
case/case study; single group: manipulation; single group:
observation; multiple group: observation; multiple group:
manipulation without random assignment; and multiple
group: manipulation with random assignment.
Manipulation was defined as the experimenter changing or
controlling some variable(s) in the research design.
Most advanced analysis was defined as the most advanced
statistical analysis employed in the study. Possible classifications were: 1) Qualitative; 2) Single Case/Case Study; 3)
Traditional; 4) Advanced.
Multiple group was used to describe any design with two
or more groups.
Replication was defined as the explicitly identified repeating of a previous research investigation with the intent to
cross-examine conclusions obtained in that previous study.
Research was defined as any design including systematic evaluation of data, excluding meta-analyses.
Single-case or small n was used to describe any single-group design with fewer than 5 participants or where participants served as their own controls.
Research entities were counted as the separate institutions
listed within each research article as having participated in
the investigation through supporting or employing one or
more of the authors.
Sample size was defined as the total number of individuals
participating in the investigation.
Multiple studies was the label assigned to research articles
including more than one explicitly identified research study.
Single case/case study was used to describe quantitative
studies that examined one or a few subjects using either
single case research design or case study design.
Single group was defined as a research design with one
group for the entire duration of the research project.
Number of groups was defined as the number of clusters or
groups in which participants belonged or were placed.
Survey was used to describe studies that employed only
questionnaires or script-based interviews for data collection.
Observation was defined as the experimenter obtaining
data without changing or controlling some variable in the
research design.
Traditional was used to describe quantitative studies that
utilized only OLS-based or equivalent nonparametric statistical analyses, such as regression, multiple regression,
ANOVA, t test, Mann-Whitney, Pearson product–moment
correlation, Spearman’s rank order correlation.
Population was categorically labeled and was determined
by the age of the participants included in the sample.
Population categories included: infant/toddler, Pre-K, K-12,
college students, adults, geriatric (ages 60+), or mixed.
Qualitative was used to describe studies that employed only
qualitative methodology, such as open-ended interviewing
to describe and develop themes, rather than quantitative
methodology.
Type of study was defined categorically as the type of research conducted. Categories included: qualitative, quantitative, survey, and mixed.
Typical sample was defined as a sample following a conventional and predictable pattern based on the majority of
the general population.
Copyright of Contemporary Issues in Communication Science & Disorders is the property of
National Student Speech Language Hearing Association and its content may not be copied or
emailed to multiple sites or posted to a listserv without the copyright holder's express written
permission. However, users may print, download, or email articles for individual use.
Social Impact of Scholarly Articles in a Citation Network
Jose A. García, Rosa Rodriguez-Sánchez, and Joaquín Fdez-Valdivia
Departamento de Ciencias de la Computación e I. A., CITIC-UGR, Universidad de Granada, 18071 Granada,
Spain. E-mail: {jags, rosa, jfv}@decsai.ugr.es
The intent of this article is to use cooperative game
theory to predict the level of social impact of scholarly
papers created by citation networks. Social impact of
papers can be defined as the net effect of citations on a
network. A publication exerts direct and indirect influence on others (e.g., by citing articles) and is itself influenced directly and indirectly (e.g., by cited articles). This
network leads to an influence structure of citing and
cited publications. Drawing on cooperative game theory,
our research problem is to translate into mathematical
equations the rules that govern the social impact of a
paper in a citation network. In this article, we show that
when citation relationships between academic papers
function within a citation structure, the result is social
impact instead of the (individual) citation impact of each
paper. Mathematical equations explain the interaction
between papers in such a citation structure. The equations show that the social impact of a paper is affected
by the (individual) citation impact of citing publications,
immediacy of citing articles, and number of both citing
and cited papers. Examples are provided for several
academic papers.
Introduction
Garfield (1955) proposed a citation index for the sciences
(a list of papers along with the articles that cite them) to
improve the scholarly communication process. He also suggested the possibility of using citations as a measure of the
impact of a scholarly article within its research field.
In this context, Garfield and Sher (1963) presented results
concerning the citation behavior of research literature in
1961. It was shown that when plotting citation frequency (i.e., the number of times a paper is cited), a small subset of papers receives the majority of citations.
The evaluation of the impact of scholarly articles aims to
identify the most influential works within research fields.
Received April 19, 2013; revised October 4, 2013; accepted October 4,
2013
© 2014 ASIS&T • Published online 9 May 2014 in Wiley Online Library
(wileyonlinelibrary.com). DOI: 10.1002/asi.23156
In a situation, for example, in which the only influential
articles are those that achieve a given number of citations, a
published manuscript not receiving the minimum number of
citations is assumed to be of less impact. However, the effect
of citation relationships between academic papers (i.e., cites
and is cited by) cannot be gauged in advance except in the
roughest terms. It can easily happen that the structure of
these citation relationships in a citation network conceals a
bias in the distribution of influence among articles that goes unsuspected and unobserved by those measuring the impact of
scholarly papers.
In this context, Garner (1967) was among the first to
consider citation analysis as a kind of network study. Small
(1973) provides one of the first applications of citation networks. Currently, few would not see citation analysis as a
form of applied graph theory, as suggested by Hu, Rousseau,
and Chen (2012), which exemplifies the state of the art using
this approach. The intent of this article is to use an economic
model of cooperative game theory to predict the level of
social impact of scholarly papers created by specific citation
networks. Lucio-Arias and Scharnhorst (2012) connect this
proposal with the history of citation network analysis from a
mathematical approach.
Hereafter, “social impact” refers to the net effect of citations on a network: A scholarly paper exerts direct and
indirect influence on other articles (by citing articles and by
articles that cite citing articles) and is itself influenced
directly and indirectly (by references and references of references). This citation network leads to an influence structure of citing and cited publications (i.e., the set of influence
relations between the manuscripts in a given citation
network). In our model of social impact, a cooperative game
is a game where groups of articles (coalitions) may enforce
cooperative behavior to gain further recognition and relevance; hence, the game is a competition between coalitions
of papers, rather than between individual articles. Here, we
assume that articles choose which coalitions to form following the citation relationships in a citation network. These
coalitions will be composed of (direct and indirect) citing
and cited articles.
JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 66(1):117–127, 2015
A cooperative game is given by specifying a value for
every coalition of articles. Formally, the game consists of a
finite set of articles N, called the grand coalition, and a characteristic function v from the set of all possible coalitions of papers to a set of payments that satisfies that the value of the empty set is zero, v(∅) = 0. The function
describes how much collective relevance (payoff) a set of
articles can gain by forming a coalition, and the game is
sometimes called a value game or a profit game. Recall that
the players (articles) are assumed to choose which coalitions
to form based on the citation relationships in a citation
network (i.e., direct and indirect citing and cited papers).
The challenge is then to allocate the collective relevance
v(N) among the individual articles in some fair way. A
solution concept is a vector that represents the allocation to
each player in the game. In game theory, researchers have
proposed different solution concepts based on different
notions of fairness. Some properties to look for in a solution
concept include (see Aumann & Hart, 2002, for further
details) efficiency, individual rationality, existence, uniqueness, computational ease, symmetry, additivity, and zero
allocation to null players. For instance, the Shapley value is
the unique payoff vector that is efficient, symmetric, additive, and assigns zero payoffs to dummy players (Shapley,
1953).
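As a concrete sketch of these definitions, the Shapley value of a small cooperative game can be computed directly by averaging each player's marginal contribution over all orders in which the grand coalition can form. The three-"article" characteristic function below is invented purely for illustration; it is not a citation network from this article:

```python
# A minimal sketch of a cooperative (value) game and the Shapley value:
# a characteristic function v over coalitions with v(empty set) = 0, and a
# fair allocation of v(N) among players. The game itself is hypothetical.
from itertools import permutations

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    values = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            values[p] += v(with_p) - v(coalition)  # marginal contribution of p
            coalition = with_p
    return {p: values[p] / len(orders) for p in players}

# Toy characteristic function: collective relevance gained by coalitions.
def v(coalition):
    payoffs = {
        frozenset(): 0,
        frozenset({"A"}): 1, frozenset({"B"}): 1, frozenset({"C"}): 0,
        frozenset({"A", "B"}): 4, frozenset({"A", "C"}): 1,
        frozenset({"B", "C"}): 1, frozenset({"A", "B", "C"}): 5,
    }
    return payoffs[coalition]

print(shapley_values(["A", "B", "C"], v))
```

Note that the resulting payoffs sum to v(N), illustrating the efficiency property, and the weakly contributing player C receives a correspondingly small share.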
Drawing on cooperative game theory, our research
problem is to translate into mathematical equations the
rules governing the social impact of a paper in a citation
network. We also want to study the differences in value
between the (individual) citation impact of a paper and its
social impact in a citation network. Thus, in this article, we
prove that when citation relationships between academic
papers (cites and is cited by) function within a citation
structure, the result is social impact instead of the (individual) citation impact of each paper (i.e., times cited). To
this aim, we derive mathematical equations explaining the
interaction between papers in such a citation structure. The
equations will show that the social impact of a paper is
affected by (individual) citation impact of citing publications, immediacy of citing articles, and number of both
citing and cited papers.
This theory of social impact of scholarly papers proves
that the greater the number of citing publications in a citation network, the greater the social impact. Immediacy takes
into account how direct or indirect the influence is that a
citing publication exerts on other papers by citing articles,
by citing citing articles, and so on. The derived equations
illustrate that there is more social impact when the citing
publications are highly cited papers, when the citing action
is more immediate, and when there is a greater number of
citing articles.
But here we also uncover another rule of the social impact of scholarly papers: the division of impact in a citation structure. This rule states that the number of cited articles also plays a role in social impact. That is, the greater the number of cited publications in a citation structure, the more the social impact is divided among all of the cited publications.
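A toy numerical sketch can illustrate these two rules together. The scheme below is our own illustration, not the authors' equations: each paper's individual citation count is spread equally over itself and the papers it cites, so impact grows when citing papers are themselves highly cited and is divided among a paper's references. The papers and citation counts are hypothetical:

```python
# Toy illustration (not the authors' derivation): each paper's individual
# citation impact is spread equally over itself and the papers it cites,
# so impact flows to cited papers and is divided among references.
# Papers "A", "B", "C" and their counts are hypothetical.

def social_impact(times_cited, cites):
    """Spread each paper's citation count equally over itself and its references."""
    impact = {p: 0.0 for p in times_cited}
    for p, count in times_cited.items():
        share = count / (1 + len(cites[p]))  # divided among self + cited papers
        impact[p] += share
        for q in cites[p]:
            impact[q] += share
    return impact

times_cited = {"A": 9, "B": 3, "C": 0}
cites = {"A": [], "B": ["A"], "C": ["A", "B"]}  # C cites A and B

print(social_impact(times_cited, cites))
```

In this sketch, paper A ends up with more social impact than its individual citation count because the cited-by paper B passes along a share, while any impact C earned would be divided across its two references.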
Drawing on an economic model of cooperative game
theory, the social impact theory of scholarly articles is both
a generalizable and a specific theory. It uses one set of
equations that are applicable to many citation relationships.
Social impact theory of scholarly papers is also useful because it identifies which citation relationships between academic papers result in the greatest impact. Hence, social impact theory explores citation relationships and can help predict their outcomes, yielding more accurate ways to measure social impact and clarifying the role of each element in a citation network.
In summary, in this article, we show the use of cooperative game theory to address the problem of measuring the impact of a scholarly manuscript in a citation network.
The main contributions are:
• A mathematical treatment for the network approach to citation
analysis
• The concept of “social impact” of an article and its measure
that yields an effective way to analyze the influence of papers
in a citation network
• The comparative study of the citation impact of a paper and its
social impact
The following section proves that without taking account
of the influence structure (in a citation network) on the
manuscripts, the value of social impact of a paper equals its
individual citation impact. Next, the Social Impact in a Citation Network section analyzes the social impact of scholarly
articles taking account of the influence structure in a citation
network. It predicts a shift in individual impact; that is, the
citation impact of a paper is equally spread over itself and
its superiors in the influence structure given by a citation
network. A mathematical model of the cooperative game
theory allows the exact calculation of the factor of social
impact that results in such a case. The comparative value of social impact and (individual) citation impact of scholarly articles is analyzed using a set of experiments in the Social Impact in a Citation Network section. Finally, we summarize the main conclusions of this work.
The “Social Impact” of Scholarly Articles
There are papers of high citation impact that retain the
power to wield single-handed influence within their research
field. At the same time, most scholarly articles individually have low citation impact as measured by the number of received citations (see, e.g., Garfield & Sher, 1963; Redner, 1998).
Generally, the measurement of the influence of a scholarly article is carried out based on a large number of manuscripts with low citation impact. However, if papers form
coalitions based on citation relationships, their collective
impact may be great enough to achieve a significant influence. In that case, the question is how to attribute to each
JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY—January 2015
DOI: 10.1002/asi
article its “fair share” of collective impact for the different
coalitions (which arise from citation relationships) in a citation network.
This problem falls naturally into the realm of cooperative
game theory (Aumann & Hart, 2002). There, too, we have
players who can form coalitions and gain by doing so; some
coalitions may gain more than others. The question there,
too, is how to divide the benefits of the cooperation among
the players. Depending on the criteria we want our solution
to satisfy, we get different solutions.
In our problem, a cooperative game consists of a set of
scholarly articles N and a characteristic function v that
describes the worth (joint impact) that every coalition
(subset) of articles could normally obtain in a citation
network.
Because interactions among all the possible coalitions of
papers may be complex, we simplify by assuming that the
cooperative possibilities of the game can be described by the
function v that assigns a number v(S) to every coalition S of published papers, with the value of the empty subset being zero, v(∅) = 0.
What each coalition S of scholarly articles can achieve
on its own regarding the joint impact, its worth v(S ),
depends on the complementarities among the individual
impact of the manuscripts in S . In this article, we consider
that the cooperative game is additive, and the worth of coalition S is then as follows:
v(S) = ∑_{i∈S} wi    (1)
where wi denotes the number of received citations of manuscript i in S (i.e., its citation impact). That is, following
Garfield (1955), it is assumed that the individual impact of
each article i ∈ N is represented by the number of received
citations wi (citation impact). Now, by means of this cooperative game, we study the distribution of the impact within
the set of scholarly articles.
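As a minimal sketch (the helper name `worth` is ours, not the article's), the additive worth of Equation (1) can be computed directly from a table of citation counts:

```python
# Sketch of Equation (1): the worth v(S) of a coalition of papers is the
# sum of their individual citation counts w_i. The counts below are the
# Table 1 example values (paper id -> times cited).
w = {1: 3, 2: 2, 3: 5, 4: 9}

def worth(S, w):
    """Worth v(S) of coalition S under the additive game of influence."""
    return sum(w[i] for i in S)

print(worth({1, 2, 3, 4}, w))  # 19, matching Equation (2)
```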
For instance, Table 1 shows a possible coalition S of
articles in N . In this example, N is the set of articles
published in academic journals—included in the Web of
Science (WoS)—during 2012.
The coalition S of scholarly articles illustrated in Table 1
is composed of four papers, S = {1, 2, 3, 4}, and the worth of
S is the sum of (individual) citation impacts of the papers in
the coalition:
v(S) = ∑_{i∈S} wi = 3 + 2 + 5 + 9 = 19    (2)
TABLE 1. An example of a coalition of scholarly articles.

Paper  Title                                                             Authors                                         Times cited, wi
1      Which Are the Best Performing Regions in Information Science in   L. Bornmann & L. Leydesdorff                    3
       Terms of Highly Cited Papers? Some Improvements of Our Previous
       Mapping Approaches
2      Percentile Ranks and the Integrated Impact Indicator (I3)         L. Leydesdorff & L. Bornmann                    2
3      The New Excellence Indicator in the World Report of the SCImago   L. Bornmann, F. de Moya-Anegón, L. Leydesdorff  5
       Institutions Rankings 2011
4      Basic Properties of Both Percentile Rank Scores and the I3        R. Rousseau                                     9
       Indicator
where the wi values were calculated using the citations received by the articles in the WoS database (accessed January 2013).

We can now define the concept of "game of influence" as follows:

Definition 1: Game of influence. A game of influence (v, N) is a cooperative game that consists of: (a) the set of scholarly articles N; and (b) an additive characteristic function v, v({i}) = wi for all i in N, determined by the worth (joint impact) of coalitions of articles (i.e., the sum of [individual] citation impacts of the papers in the coalition).

This definition does not take account of the influence structure in a citation network (the set of influence relations between the manuscripts in a given citation network) beyond the individual citation impacts that were summed to obtain the worth of coalitions. However, it can easily happen that the structure of these citation relationships in a citation network conceals a bias in the distribution of influence among articles. This will be analyzed in the Social Impact in a Citation Network section.

Let S and P be two coalitions of papers, where S may be a subset of P. Following Harsanyi (1959), the game v can then be expressed as

v(P) = ∑_{S⊆N: S≠∅} Δv(S) · uS(P);  P ⊆ N    (3)

where the quantity Δv(S) is referred to as the dividend of coalition S in the game of influence (v, N), and where uS denotes the unanimity game (i.e., a game in which coalition S is trying to maximize the joint impact of another coalition P only if coalition S is a subset of P) given by

uS(P) = 1 if S ⊆ P; 0 otherwise.    (4)

From Equation (3), to analyze the game of influence (v, N) properly, we study its behavior on the collection of all unanimity games uS, where S ⊆ N: S ≠ ∅. Because the unanimity games form a basis, the expansion in Equation (3) is uniquely determined. To this end, we first need the following result.

Proposition 1: Dividends of a game of influence. In a game of influence for measuring the impact of scholarly manuscripts, the dividend Δv(S) of coalition S, S ⊆ N: S ≠ ∅, is given by

Δv(S) = wi if S = {i} for some i ∈ N; 0 otherwise.    (5)

where wi is the number of received citations of manuscript i ∈ N (its citation impact).

Proof. See Proof of Proposition 1 section in the Appendix.
From Proposition 1, it follows that the only coalitions of
papers that have positive dividends are those composed of
one manuscript. This is a direct consequence of Definition 1
that does not take into account the set of influence relations in
a citation network (beyond the individual citation impacts).
Manuscripts with low individual citation impact (as given by the number of received citations) may nonetheless be influential via coalitions in a citation network. A payoff
x = {x i (v); i ∈ N } for the game of influence (v, N ) is a
correspondence that associates with each manuscript i a
possible payment for being involved in the game. It charges
each manuscript its fair share of coalitional influence (which
we are looking for). Hence a payoff x for the game (v, N )
provides an estimation of the cooperative influence of
scholarly articles in N .
In game theory, Shapley (1953) defined a value for games
to be a function that assigns to each game v a payoff xi(v) for
each i ∈ N , which can be described by
xi(v) = ∑_{S⊆N: i∈S} Δv(S)/|S|    (6)

with |S| being the cardinal of subset S, where Δv(S) denotes the dividend of coalition S.
In our problem, the Shapley value is a unique function
that satisfies three axioms: (symmetry axiom) manuscripts that are treated identically by the game v are treated identically by the value xi(v); (carrier axiom) the sum of xi(v) over all manuscripts i in any N equals v(N);
and (additivity axiom) for any games v and w,
xi(v + w) = xi(v) + xi(w). Also, Driessen (1988) proved that
the Shapley value of a superadditive game (i.e., a game
such that v(S ) + v(T ) ≤ v(S ∪ T ) for any S , T ⊆ N ) is
individually rational.
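A short sketch may make Equation (6) concrete (the function name is ours, not the article's; the dividends follow Proposition 1, so each paper's value collapses to its own citation count):

```python
# Sketch of Equation (6): the Shapley value computed from Harsanyi
# dividends. `dividends` maps each coalition (a frozenset) to its dividend.
def shapley_from_dividends(dividends, players):
    """x_i(v) = sum over coalitions S containing i of dividend(S) / |S|."""
    return {i: sum(d / len(S) for S, d in dividends.items() if i in S)
            for i in players}

# For the additive game of influence, Proposition 1 says only singleton
# coalitions carry a dividend w_i (Table 1 citation counts used here).
w = {1: 3, 2: 2, 3: 5, 4: 9}
dividends = {frozenset({i}): w[i] for i in w}
print(shapley_from_dividends(dividends, w))  # each paper recovers its own w_i
```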
Based on the Shapley value of a game of influence, we
can now define the concept of “social impact” of a manuscript as follows:
Definition 2: Social impact of scholarly articles. Given a
game of influence (v, N ), the social impact SIv(i) of a manuscript i ∈ N is
SIv(i) = ∑_{S⊆N: i∈S} Δv(S)/|S|.    (7)
The following result is a direct consequence of the definition of a game of influence given in Definition 1. It shows that if citation relationships between academic papers (i.e., cites and is cited by) do not give rise to an influence structure on scholarly articles, the social impact is equal to the (individual) citation impact of each paper.

TABLE 2. Social impact and times cited (citation impact) of manuscripts in S (see Table 1).

Paper  Social impact, SIv(i)  Times cited
1      3                      3
2      2                      2
3      5                      5
4      9                      9
Proposition 2. In a game of influence (v, N ) as given in
Definition 1, the social impact of a manuscript i equals its
individual citation impact (times cited wi):
SIv(i) = wi, with i ∈ N.    (8)
Proof. It simply follows from substituting Equation (5)
in the social impact SIv(i) of a manuscript as given in
Definition 2.
Following Proposition 2, Table 2 shows the values of
social impact for the manuscripts in coalition S = {1, 2, 3, 4}
given in Table 1.
As discussed earlier, to achieve the result given by Proposition 2, we assumed that there is no influence structure on the articles that can bias social impact. However, this assumption is unrealistic, because a citation network induces an influence structure on scholarly articles, as demonstrated in the following section.
Social Impact in a Citation Network
The influence structure in a citation network (the set of
influence relations between the manuscripts in a given
citation network) may go beyond the individual citation
impacts that were summed to obtain the worth of earlier
coalitions. In this case, it can easily happen that the structure
of these citation relationships in a citation network conceals
a bias in the distribution of influence among articles unsuspected and unintended by the methods that were used to
measure the impact of scholarly papers in the previous
section.
Following Hu et al. (2012), we have that, within a citation
network, a manuscript exerts direct and indirect influence on
other papers (by citing articles and by articles that cite citing
articles), and is itself influenced directly and indirectly (by
references and references of references). This leads to an
influence structure of citing and cited publications.
This section studies the social impact of scholarly
articles, taking into account the influence structure in a citation network. It predicts a shift in individual impact; that is,
the citation impact of a paper is equally spread over itself
and its superiors in the influence structure given by a citation
JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY—January 2015
DOI: 10.1002/asi
FIG. 1. All descendants and superiors of paper 2 in the influence structure
O (see Table 3).
FIG. 2. All descendants and superiors of paper 3 in the influence structure
O (see Table 3).
network. Again, a mathematical model of the cooperative
game theory allows the exact calculation of the factor of
social impact in this case.
Influence Structure in a Citation Network
First, we need to define how the citation relationships
(i.e., cites and is cited by) in a given citation network may
give rise to an influence structure on scholarly articles. We
assume that all references have already been published by an
academic journal before being cited by another manuscript
of N .
Definition 3: Influence structure in a citation network.
Let a direct descendant of paper i be a manuscript that cites
i. Given a citation network, the influence structure on the set
N of manuscripts is a mapping O on N such that O(i),
with i ∈ N, is the set of direct descendants of i in the citation
network.
Therefore, O(i), with i ∈ N, is the subset of manuscripts
where paper i is being cited. Given an influence structure O, the collection of all (direct and indirect) descendants of the published article i, denoted by D(i), defines the transitive closure of the influence structure O.
That is, the descendants of manuscript i are the subset of
(direct and indirect) citing articles of the manuscript i.
In the following, we also denote by

D−1(i) = {k ∈ N | i ∈ D(k)}    (9)
the set of all superiors of manuscript i ∈ N in the influence
structure O on N . The superiors of manuscript i are the
subset of direct and indirect references of the manuscript i.
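Definition 3 and the sets D(i) and D−1(i) can be sketched in a few lines (the edge-list representation and helper names are our assumption, not the article's):

```python
# Sketch of Definition 3: an edge (i, j) means paper j cites paper i,
# so j is a direct descendant of i, i.e. j is in O(i).
def influence_structure(edges, papers):
    """O: maps each paper i to its set of direct descendants (citing papers)."""
    O = {i: set() for i in papers}
    for cited, citing in edges:
        O[cited].add(citing)
    return O

def descendants(O, i):
    """D(i): transitive closure of O, i.e. all direct and indirect citers of i."""
    seen, stack = set(), list(O[i])
    while stack:
        j = stack.pop()
        if j not in seen:
            seen.add(j)
            stack.extend(O[j])
    return seen

def superiors(O, i):
    """D^{-1}(i): all papers k with i in D(k) (direct and indirect references)."""
    return {k for k in O if i in descendants(O, k)}

# Toy chain: paper 2 cites paper 1, paper 3 cites paper 2.
O = influence_structure([(1, 2), (2, 3)], {1, 2, 3})
print(descendants(O, 1))  # D(1) = {2, 3}
print(superiors(O, 3))    # D^{-1}(3) = {1, 2}
```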
Figures 1 through 4 show the collection of all descendants (transitive closure of O) and superiors for the manuscripts in the coalition S = {1, 2, 3, 4}, which is illustrated in Table 1. Again, in this example, N is the set of articles published in academic journals (included in the WoS) during 2012. The set of descendants of coalition S, that is, D({1, 2, 3, 4}), is simply given by D(S) = D(1) ∪ D(2) ∪ D(3) ∪ D(4).

FIG. 3. All descendants and superiors of paper 1 in the influence structure O (see Table 3).
Table 3 shows the set of papers and their descendants for
this coalition S of example: {1, 2, 3, 4} ∪ D({1, 2, 3, 4}).
For each paper, this table also illustrates the accession
number and times cited in the WoS database (accessed
January 2013).
In Figures 1 through 4, using a directed graph, we illustrate the descendants and superiors of each paper in the
coalition S = {1, 2, 3, 4}. In these four figures, each manuscript is represented by a node in the graph. A directed edge
from node i to j means that paper i is being cited by paper j.
A paper k is a descendant of i if there exists a path from i to
k in the graph. For instance, paper 5 is a descendant of paper
1 (see Figure 3): 5 ∈ D(1). Hence Figures 1 through 4 show
all descendants (and superiors) of papers 2, 3, 1, and 4, in the
influence structure O on N. These examples show the same citation structure, with the same number of nodes (papers) and links ("is cited by"), but each focuses on a different paper with a distinct level of social influence.
FIG. 5. Superiors of S: D−1(S) = D−1(1) ∪ D−1(2) ∪ D−1(3) ∪ D−1(4).
FIG. 4. All descendants and superiors of paper 4 in the influence structure
O (see Table 3).
TABLE 3. Papers and all (direct and indirect) descendants (in N) for coalition S = {1, 2, 3, 4}.

Paper  Accession number       Times cited
1      WoS:000301364100018    3
2      WoS:000307730000017    2
3      WoS:000301364100017    5
4      WoS:000302157900016    9
5      WoS:000310550000019    0
6      WoS:000308516300002    0
7      WoS:000308581700016    0
8      WoS:000308888400013    1
9      WoS:000310550000010    0
10     WoS:000311853600006    0
11     WoS:000305233900015    1
12     WoS:000311234600025    0
13     WoS:000308581700021    0
14     WoS:000306547000013    1
15     WoS:000305434900004    0
16     WoS:000306547000017    2
17     WoS:000306547000012    1
18     WoS:000306547000018    0
19     WoS:000311515900011    0

Note. N is the set of articles published in academic journals (included in the WoS) during 2012.
Next, we define a class of coalitions of articles that are productive without requiring papers outside those coalitions, because all superiors (references) of the manuscripts in such an "autonomous" coalition are also members of the coalition.

Definition 4: Autonomous collections of manuscripts. Let O be an influence structure on N. The coalition S ⊆ N is autonomous in the influence structure O if D−1(S) ⊂ S, with D−1(S) = ∪_{i∈S} D−1(i).
Figure 5 shows the set of all superiors in N for the
coalition S = {1, 2, 3, 4} of the example given in Table 1.
A Venn diagram is used to illustrate that coalition
S = {1, 2, 3, 4} is autonomous in the influence structure O,
because D −1 (S ) ⊂ S (see Figure 6).
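The autonomy check of Definition 4 is a one-line set inclusion. In this sketch (helper name ours), the superior sets are those implied by Tables 4 through 7 for the example, e.g. D−1(2) = {1, 3, 4}:

```python
# Sketch of Definition 4: a coalition S is autonomous in O when every
# superior (direct or indirect reference) of its members is itself in S.
def is_autonomous(S, superiors):
    """`superiors` maps each paper i to D^{-1}(i), its set of all superiors."""
    return set().union(*(superiors[i] for i in S)) <= S

# Superior sets consistent with the worked example for papers 1-4.
superiors = {1: {4}, 2: {1, 3, 4}, 3: {1, 4}, 4: set()}
print(is_autonomous({1, 2, 3, 4}, superiors))  # True, as in Figure 6
```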
FIG. 6. Venn diagram that shows that S is autonomous in the influence structure O: D−1(S) ⊂ S.
We can now address the game theoretic analysis taking account of the influence structure. There, the citation relationships impose asymmetric constraints on the manuscripts. That is, for every i, k ∈ N,

k ∈ O(i) implies that i ∉ O(k),

because we assumed that all references have already been published before being cited by another manuscript of N; thus, if k ∈ O(i), manuscript i was published before being cited by manuscript k, and manuscript i cannot be a descendant of k in the influence structure, that is, i ∉ O(k).
Finally, we are now in a position to define the basic concept as follows:

Definition 5: Game with an influence structure. A game of influence (v, N, O) is a cooperative game on the set of scholarly articles N, where an additive characteristic function v, v({i}) = wi for all i in N, determines the worth (joint impact) that coalitions could normally obtain were it not for the influence structure O on N in a citation network.
Restricted Game
In the definition of game with an influence structure
(Definition 5), the worth of articles (and their coalitions) is
still independent of the influence structure O. Following
Gilles, Owen, and van den Brink (1992), we need to
transform this game of influence (v, N , O) into a new game.
It describes all possibilities open to the scholarly articles in
the influence structure O, given their potential abilities as
described by the additive game v. The resulting game is
called the restricted game, defined as follows:
Definition 6: Restricted game. Let (v, N, O) be a game of influence with a given influence structure O. The restricted game (vO, N) is a cooperative game on the set of scholarly manuscripts N, where the worth of the coalitions S ⊆ N determines what they can achieve (in terms of collective influence), taking account of the influence structure O:

vO(S) = v(σ(S)), for all S ⊆ N    (10)

where σ(S) is the largest autonomous subset of S. That is, the worth of coalition S when taking into account the influence structure O is simply the worth (joint impact) that the largest autonomous subset of S could normally obtain.

In this study, we can also analyze the restricted game (vO, N) on the collection of all unanimity games uS as given in Equation (4), where S ⊆ N: S ≠ ∅. To this aim, we first need a new proposition that gives the form of the dividends of a restricted game as follows:

Proposition 3: Dividends of a restricted game. In a restricted game (vO, N) with an influence structure O, the dividend ΔvO(S) of coalition S, S ⊆ N: S ≠ ∅, is given by

ΔvO(S) = wi if S = α({i}) for some i ∈ N; 0 otherwise.    (11)

where wi is the (individual) citation impact of manuscript i ∈ N, and with α(R) being the smallest autonomous coalition that contains all members of R, as well as their superiors in the influence structure, for example, α({i}) = {i} ∪ D−1(i).

Proof. See Proof of Proposition 3 section in the Appendix.

Based on the Shapley value of a restricted game, we can now define the concept of social impact of a scholarly manuscript taking account of the influence structure in a citation network:

Definition 7: Social impact in a restricted game with an influence structure. Given a restricted game (vO, N), the social impact SIvO(i) of an article i ∈ N when taking account of an influence structure O is

SIvO(i) = ∑_{S⊆N: i∈S} ΔvO(S)/|S|    (12)

with ΔvO(S) being the dividend of coalition S in the restricted game (vO, N).

The following proposition predicts a substantial shift in (individual) citation impact of scholarly articles when considering the influence structure O. It explains the interaction between papers in such a citation structure. This proposition shows that the social impact of a paper is affected by the (individual) citation impact of citing papers, the immediacy of citing publications, and the number of both citing and cited articles. It is based on the social impact of a scholarly manuscript in a restricted game (vO, N):

Proposition 4: Shift in (individual) citation impact of papers. In a restricted game (vO, N), the social impact of a scholarly article i ∈ N is given by

SIvO(i) = ∑_{k∈{i}∪D(i)} wk/(|D−1(k)| + 1)    (13)

where wk is the number of received citations of article k (its citation impact), D(i) is the set of all descendants of article i in the influence structure, and with |D−1(k)| being the cardinal of the set of all superiors of manuscript k.

Proof. See Proof of Proposition 4 section in the Appendix.
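Proposition 4 translates into a few lines of code. The sketch below uses a toy three-paper chain, not the article's WoS example, and the helper name is ours; it also illustrates the division of impact, since each paper's citation count is shared equally among itself and its superiors:

```python
# Sketch of Proposition 4: SI_vO(i) = sum over k in {i} union D(i) of
# w_k / (|D^{-1}(k)| + 1). Edges (cited, citing) define the citation network.
def social_impact(i, w, edges):
    papers = set(w)
    O = {p: set() for p in papers}          # direct descendants O(p)
    for cited, citing in edges:
        O[cited].add(citing)
    def D(p):                               # all direct and indirect citers
        seen, stack = set(), list(O[p])
        while stack:
            q = stack.pop()
            if q not in seen:
                seen.add(q)
                stack.extend(O[q])
        return seen
    sup = {k: {p for p in papers if k in D(p)} for k in papers}  # D^{-1}(k)
    return sum(w[k] / (len(sup[k]) + 1) for k in {i} | D(i))

# Toy chain 1 <- 2 <- 3 with citation counts (2, 1, 0): paper 1 collects a
# share of every descendant's count.
w = {1: 2, 2: 1, 3: 0}
print(social_impact(1, w, [(1, 2), (2, 3)]))  # 2/1 + 1/2 + 0/3 = 2.5
```

Summing the sketch's social impact over all three papers gives 3.0, the total number of citations in the network, reflecting the carrier (efficiency) axiom of the underlying Shapley value.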
This mathematical result shows that the greater the
number of citing publications in a citation network, the
greater the social impact. Also, there is more social impact
when citing publications are highly cited papers, when the
citing action is more immediate, and when there is a greater
number of citing articles. It also proves that a greater number of cited publications in a citation structure causes the social impact to be divided among all of the cited publications. In summary, this proposition proves the existence
of a shift in individual citation impact that arises from the
influence structure in a citation network. This is a main
result of this study because, from The “Social Impact” of
Scholarly Articles section, it follows that, without taking
account of the influence structure, the social impact value of
a manuscript equals its individual impact.
This shift in individual citation impact of scholarly articles is best illustrated with an example.
Figure 7 shows the individual citation impacts (times cited)
of articles {1, 2, 3, 4}. Following Proposition 2, without
taking account of the influence structure, the social impact
of the articles in the unrestricted game is simply given by the
respective number of received citations.
However, these values are unrealistic given the citation relationships in the citation network, which were illustrated in Figures 1 through 4.
Figure 7 shows the social impact of the scholarly manuscripts, when considering the influence structure O, as given in Proposition 4. It shows the substantial shift in their individual citation impact that results from the influence structure. A new figure (Figure 8) illustrates descendants and superiors for individual papers with higher and lower social impact. Figure 8 complements Figures 1 through 4. It can be used to better understand the concept of social impact: (a) the greater the number of citing publications in a citation network, the greater the social impact; and (b) a greater number of cited publications in a citation network causes the social impact to be divided among all of the references.

FIG. 7. Shift in individual citation impact (times cited) of scholarly articles {1, 2, 3, 4} in the influence structure given by citation relationships in a citation network.

FIG. 8. Descendants and superiors for papers with higher and lower social impact.

Tables 4 through 7 show the respective computations of the social impact values of papers 1, 2, 3, and 4, taking account of the influence structure, which was illustrated in Figures 1 through 5 and 8. Each table shows the paper's descendants k, with the corresponding accession number, times cited, and wk/(|D−1(k)| + 1) value. At the bottom of each table, the value of social impact is computed following Proposition 4.

Conclusions
We have developed a formal theory for the division of influence among scholarly manuscripts that applies the mathematical theory of games to social power. The concept
of “social impact” of an article and its measure yield an
effective way to analyze the impact of papers in a citation
structure. Here, social impact refers to the net effect of
citations on a network.
The use of game theory to address the problem of measuring the social impact of a scholarly manuscript in a citation network provides a rigorous mathematical treatment for the network approach to citation analysis. This
mathematical model allows the exact calculation of the
social impact value.
Drawing on cooperative game theory, we have proved
that, if citation relationships between academic papers do
not give rise to an influence structure on scholarly articles,
the social impact value of a manuscript equals its individual
citation impact (e.g., number of received citations).
However, the influence structure in a citation network may
go beyond the individual citation impacts that were summed
to obtain the worth of coalitions of papers. Following Hu
et al. (2012), we have that, within a citation network, a
manuscript exerts direct and indirect influence on other
papers (by citing articles and by articles that cite citing
articles) and is itself influenced directly and indirectly (by
references and references of references). This leads to a
more complex influence structure of citing and cited
publications.
Again, drawing on cooperative game theory, we proved
that when citation relationships between academic papers
function within a citation structure, the result is social
impact instead of individual impact. The equations showed
how the social impact of a paper is affected by individual
impact of citing papers, immediacy of citing publications,
and number of both citing and cited articles.
This theory of the social impact of scholarly papers
proved that the greater the number of citing publications in
a citation network, the greater the social impact. Also, the
derived equations illustrated that there is more social impact
when citing publications are highly cited papers, when the
citing action is more immediate, and when there is a greater
number of citing articles. We also uncovered another rule relating to the social impact of scholarly papers: the division of impact in a citation structure. Thus, a greater number of cited publications in a citation structure causes the social impact to be divided among all of the cited publications.

TABLE 4. Social impact of paper 1, with an influence structure.

Paper 1 (times cited: 3; |D−1(1)| = 1)

Descendant  Accession number       Times cited (wk)  |D−1(k)|  wk/(|D−1(k)| + 1)
2           WoS:000307730000017    2                 3         0.5
3           WoS:000301364100017    5                 2         1.67
5           WoS:000310550000019    0                 6         0
6           WoS:000308516300002    0                 4         0
7           WoS:000308581700016    0                 2         0
8           WoS:000308888400013    1                 5         0.17
9           WoS:000310550000010    0                 5         0
10          WoS:000311853600006    0                 4         0
11          WoS:000305233900015    1                 4         0.2
∑                                                              2.53

Social impact: SIvO(1) = ∑_{k∈{1}∪D(1)} wk/(|D−1(k)| + 1) = 3/(1 + 1) + 2.53 = 4.03

TABLE 5. Social impact of paper 2, with an influence structure.

Paper 2 (times cited: 2; |D−1(2)| = 3)

Descendant  Accession number       Times cited (wk)  |D−1(k)|  wk/(|D−1(k)| + 1)
5           WoS:000310550000019    0                 6         0
8           WoS:000308888400013    1                 5         0.17
∑                                                              0.17

Social impact: SIvO(2) = ∑_{k∈{2}∪D(2)} wk/(|D−1(k)| + 1) = 2/(3 + 1) + 0.17 = 0.67

TABLE 6. Social impact of paper 3, with an influence structure.

Paper 3 (times cited: 5; |D−1(3)| = 2)

Descendant  Accession number       Times cited (wk)  |D−1(k)|  wk/(|D−1(k)| + 1)
2           WoS:000307730000017    2                 3         0.5
5           WoS:000310550000019    0                 6         0
6           WoS:000308516300002    0                 4         0
8           WoS:000308888400013    1                 5         0.17
9           WoS:000310550000010    0                 5         0
10          WoS:000311853600006    0                 4         0
11          WoS:000305233900015    1                 4         0.2
∑                                                              0.87

Social impact: SIvO(3) = ∑_{k∈{3}∪D(3)} wk/(|D−1(k)| + 1) = 5/(2 + 1) + 0.87 = 2.53
To illustrate this main result, the comparative value of the
social impact of a scholarly article with and without taking
account of the influence structure in a citation network has
also been analyzed in a set of experiments. Assuming there
is no asymmetric constraint on the scholarly articles
imposed by some influence structure, the social impact of
papers in the unrestricted game is simply given by the
respective number of received citations (its individual
citation impact). For instance, papers 1, 2, 3, and 4 in the example have a social impact of 3, 2, 5, and 9, respectively (see Table 2).
Instead, scholarly manuscripts exhibit a substantial shift
in (individual) citation impact that results from the influence
structure. In this case, the same papers 1, 2, 3, and 4 have a
social impact of 4.03, 0.67, 2.53, and 13.74 (see Tables 4
through 7).
The intent of this article is to present a novel theory of
social impact of scholarly articles.

TABLE 7. Social impact of paper 4, with an influence structure.

Paper 4 (times cited: 9; |D−1(4)| = 0)

Descendant  Accession number       Times cited (wk)  |D−1(k)|  wk/(|D−1(k)| + 1)
1           WoS:000301364100018    3                 1         1.5
2           WoS:000307730000017    2                 3         0.5
3           WoS:000301364100017    5                 2         1.67
5           WoS:000310550000019    0                 6         0
6           WoS:000308516300002    0                 4         0
7           WoS:000308581700016    0                 2         0
8           WoS:000308888400013    1                 5         0.17
9           WoS:000310550000010    0                 5         0
10          WoS:000311853600006    0                 4         0
11          WoS:000305233900015    1                 4         0.2
12          WoS:000311234600025    0                 3         0
13          WoS:000308581700021    0                 1         0
14          WoS:000306547000013    1                 2         0.33
15          WoS:000305434900004    0                 5         0
16          WoS:000306547000017    2                 6         0.29
17          WoS:000306547000012    1                 10        0.09
18          WoS:000306547000018    0                 9         0
19          WoS:000311515900011    0                 12        0
∑                                                              4.74

Social impact: SIvO(4) = ∑_{k∈{4}∪D(4)} wk/(|D−1(k)| + 1) = 9/(0 + 1) + 4.74 = 13.74

As regards future directions, we are developing a publicly available suite of
web-based tools designed to calculate the social impact
value of academic papers. In addition, we will provide an
interface for the analysis of social impact, which will be
freely available to the scientific community.
Of course, the selection of the Shapley value of a game of influence to define the concept of social impact is a limitation of the proposed method, because other solution concepts could be applied to our problem, in which cooperative game theory implies a strategy and an intention from the actors, that is, the "articles" and the "citations." Future work includes the analysis of alternative solution concepts to the Shapley value used in this paper. Going forward, we will also consider several questions; for example, what is the role of self-citations, and how do they influence the joint impact and the individual share? Can this method be used in evaluation practices, and what are the limitations?
Acknowledgments

This research was sponsored by the Spanish Board for
Science and Technology under grant TIN2010-15157,
cofinanced with European FEDER funds. Sincere thanks
are given to the reviewers for their constructive
suggestions.

References

Aumann, R., & Hart, S. (Eds.). (2002). Handbook of game theory (Vol. 1).
Handbooks in Economics Series No. 11. Amsterdam, The Netherlands:
North-Holland, Elsevier Science B.V.
Driessen, T. (1988). Cooperative games, solutions and applications.
Dordrecht, The Netherlands: Kluwer Academic Publishers.
Garfield, E. (1955). Citation indexes for science: A new dimension in
documentation through association of ideas. Science, 122(3159), 108–111.
Garfield, E., & Sher, I.H. (1963). New factors in the evaluation of scientific
literature through citation indexing. American Documentation, 14(3),
195–201.
Garner, R. (1967). A computer oriented, graph theoretic analysis of citation
index structures. In B. Flood (Ed.), Three Drexel information science
research studies (pp. 3–46). Philadelphia, PA: Drexel Press.
Gilles, R.P., Owen, G., & van den Brink, R. (1992). Games with permission
structures: The conjunctive approach. International Journal of Game
Theory, 20, 277–293.
Harsanyi, J.C. (1959). A bargaining model for cooperative n-person games.
In A.W. Tucker & R.D. Luce (Eds.), Contributions to the theory of
games IV (pp. 325–355). Princeton, NJ: Princeton University Press.
Hu, X.J., Rousseau, R., & Chen, J. (2012). Structural indicators in citation
networks. Scientometrics, 91, 451–460.
Lucio-Arias, D., & Scharnhorst, A. (2012). Mathematical approaches to
modeling science from an algorithmic-historiography perspective.
Understanding Complex Systems, 23–66.
Redner, S. (1998). How popular is your paper? An empirical study of the
citation distribution. The European Physical Journal B—Condensed Matter
and Complex Systems, 4(2), 131–134.
Shapley, L.S. (1953). A value for n-person games. In H.W. Kuhn & A.W.
Tucker (Eds.), Contributions to the theory of games II (pp. 307–317).
Princeton, NJ: Princeton University Press.
Small, H. (1973). Co-citation in the scientific literature: A new measure of
the relationship between two documents. Journal of the American
Society for Information Science, 24(4), 265–269.

Appendix

Proof of Proposition 1

From Harsanyi (1959), the dividends Δ_v(S) are given by

$$ \Delta_v(S) = \sum_{T \subseteq S} (-1)^{|S| - |T|}\, v(T) \qquad (14) $$

for all S ⊆ N, S ≠ ∅. Given that v is an additive
game, we have that

$$ \Delta_v(S) = \sum_{T \subseteq S} (-1)^{|S| - |T|} \sum_{i \in T} w_i = \sum_{i \in S} w_i \Bigl[ \sum_{T \subseteq S,\; i \in T} (-1)^{|S| - |T|} \Bigr] \qquad (16) $$

The proof is completed by noting that the expression in
brackets vanishes except for |S| = 1, since for each i ∈ S the inner
sum equals (1 − 1)^{|S|−1}.
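The vanishing of the dividends for coalitions of size greater than one can be checked numerically. The following sketch computes Harsanyi dividends by the inclusion-exclusion formula (14) for an additive game; the weights and the `dividends` helper are illustrative assumptions, not from the paper.

```python
from itertools import combinations

def dividends(players, v):
    """Harsanyi dividends: Delta_v(S) = sum_{T subset of S} (-1)^(|S|-|T|) v(T)."""
    delta = {}
    for r in range(1, len(players) + 1):
        for S in combinations(players, r):
            delta[S] = sum((-1) ** (len(S) - len(T)) * v(T)
                           for k in range(len(S) + 1)
                           for T in combinations(S, k))
    return delta

w = {1: 9.0, 2: 4.74, 3: 2.5}          # hypothetical additive weights
v = lambda T: sum(w[i] for i in T)     # additive game: v(T) = sum of w_i over T

d = dividends([1, 2, 3], v)
# Singleton coalitions recover the weights (d[(1,)] is w[1]);
# dividends of all coalitions with two or more members are (numerically) zero.
```

This mirrors the proof: for an additive game, all of the value is carried by the singleton dividends, which is exactly why the Shapley value reduces to the per-paper shares used in the social impact computation.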
Proof of Proposition 3
From the de...