Studies in the language of middle school science : a linguistic and multimodal analysis of the language used in science and social studies textbooks



The main aim of this dissertation was to investigate semantic discontinuity, one of the linguistic features that has been described as characteristic of science texts, in a large corpus of middle school science and social studies textbooks (24 textbooks) and in a smaller hand-annotated corpus (39 chapters) extracted from the larger one. Semantic discontinuity refers to the purported lower frequency of discourse markers in science texts. Besides semantic discontinuity, two of the three studies in this dissertation examined the types of relationships that discourse markers, explicit and implicit, signaled between discourse segments. Because the complexities of science texts are not limited to language, the third study included a multimodal analysis of the visual representations that appeared in the hand-annotated corpus. With regard to semantic discontinuity, 16.2% of the sentences in science and 14.1% of the sentences in social studies in the large corpus contained discourse markers, a significant difference (p < .001). In the second study, which used the smaller hand-annotated corpus, science sections had an average of 24.5% discourse markers while social studies sections had an average of 23.2%, although this difference was not significant. These results show that science textbooks at the middle school level did not present more semantic discontinuity than their social studies counterparts. In fact, the science materials analyzed had, on average, more explicit markers than the social studies materials in both studies. These findings also indicate that most discourse segments (i.e., clauses and sentences) in textbook materials do not contain discourse markers. The low percentage of clauses with explicit markers, however, does not seem to be limited to the language used in science textbooks.
In relation to the discourse relation categories, both studies found that science texts had a higher frequency of discourse segments in an inferential relation than social studies texts. Although the first study found a significantly higher frequency of contrastive/comparison markers in social studies texts (p < .05), this difference was not significant in the hand-annotated corpus. In the first study, there was a significant interaction between grade level and discourse marker in science textbooks (p < .05), with the number of inferential markers increasing at the higher grade levels. Similarly, the models run in the first study found a significant increase in contrastive markers in social studies textbooks at higher grade levels (p < .05). These differences were not significant in the second study, although this might be the result of the smaller sample size of the hand-annotated corpus.
With regard to the multimodal analysis, science textbooks had a significantly higher proportion of hybrid images (images that combined realistic and abstract representations) than social studies textbooks (p < .05). In both subject areas, realistic images (e.g., photographs) were the most frequent of the three categories. In terms of cross-reference links, visual representations in science sections were referred to in the written material, via either descriptive or directive links, at higher rates than those in social studies sections, where visual representations were almost never referred to in their corresponding texts. The use of captions also differed: visual representations in science sections used more explanatory captions, while social studies sections employed more engagement captions, which mostly asked students questions about the visual representations they accompanied. Finally, science and social studies images were mostly used to elaborate on (i.e., present additional information about) the content presented in the written material. Although the science sections analyzed had a higher percentage of images that provided evidence for the concepts presented in the text than the social studies sections did, elaboration relations made up the highest proportion of rhetorical relations in both science and social studies. The findings presented in this dissertation point to the need for adequate descriptions of the multimodal language used in science at different grade levels. Only then will the education community be able to develop literacy tasks based on the similarities and differences in the language that different subjects use to present conceptual information.


Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English


Associated with Roman, Diego Xavier
Associated with Stanford University, Graduate School of Education.
Primary advisor Hakuta, Kenji
Thesis advisor Hakuta, Kenji
Thesis advisor Osborne, Jonathan
Thesis advisor Quinn, Helen R
Thesis advisor Rohde, Hannah
Thesis advisor Valdés, Guadalupe


Genre Theses

Bibliographic information

Statement of responsibility Diego Xavier Roman.
Note Submitted to the Graduate School of Education.
Thesis Thesis (Ph.D.)--Stanford University, 2014.
Location electronic resource

Access conditions

© 2014 by Diego Xavier Roman
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
