Structuring Content with HTML

HTML (HyperText Markup Language) is a markup language that tells web browsers how to structure the web pages you visit.

When properly structured, it also defines the semantics of the content in a machine-readable way.

HTML consists of a series of elements (defined using tags), which you use to enclose, wrap, or mark up different parts of content to make it appear or act in a certain way.

Basic HTML Syntax

Most HTML elements have three features: an opening tag, the content, and a closing tag.

Elements can be nested within one another.

For example, a word can be made bold within a paragraph.

All nested elements should be closed before the parent element is closed.
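
As a minimal sketch of the nesting rule (the sentence is only a placeholder), the first paragraph below closes the inner <strong> element before the outer <p> element, while the second closes them in the wrong order:

```html
<!-- Correct: <strong> opens and closes inside <p> -->
<p>My cat is <strong>very</strong> grumpy.</p>

<!-- Incorrect: <p> is closed before the nested <strong> element -->
<p>My cat is <strong>very grumpy.</p></strong>
```

Browsers try to guess what was meant when tags overlap like this, and the result may not be what you expect.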

A void element consists of only a single tag.

For example, the <img> element:

<img
  src="https://raw.githubusercontent.com/mdn/beginner-html-site/gh-pages/images/firefox-icon.png"
  alt="Firefox icon" />

Some elements, such as the <img> element above, also have attributes, which provide additional information about what the element should do.

Boolean attributes are attributes that do not require a value. If the attribute is present on the element, it is treated as true regardless of any value given; if it is absent, it is false.

For example:

<!-- using the disabled attribute prevents the end user from entering text into the input box -->
<input type="text" disabled />

Anatomy of an HTML document

Let's examine a simple HTML document:

<!doctype html>
<html lang="en-US">
  <head>
    <meta charset="utf-8" />
    <title>My test page</title>
  </head>
  <body>
    <p>This is my page</p>
  </body>
</html>

  1. <!doctype html>: A historical artifact that still needs to be included for the page to behave correctly. Without it, browsers fall back to quirks mode.
  2. <html></html>: The root element. It wraps all other elements.
    1. It is good practice to include the lang attribute on the root element to specify the language of the page, e.g. <html lang="en-US">.
  3. <head></head>: A container for everything that isn't content shown to viewers, including keywords, the title, and the description.
  4. <meta charset="utf-8">: Represents metadata that isn't covered by the other metadata-related elements. The charset attribute sets the character encoding to UTF-8. Including it is a good habit because it avoids problems with characters displaying incorrectly.
  5. <title></title>: Sets the title of the page, which appears in the browser tab or title bar.
  6. <body></body>: Contains all the content that displays on the page.

Whitespace

Whitespace makes HTML more readable but, as in Java, it is not syntactically significant: runs of spaces, tabs, and line breaks are treated as a single space.
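
A small illustration with placeholder text: under default browser styling, both paragraphs below should render the same way, even though the second is spread across two lines with extra spaces.

```html
<p>Dogs are silly.</p>

<p>Dogs        are
    silly.</p>
```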

Special Characters

In HTML, the following are special characters:

< > " ' &

To use them in text, you must use character entities.

Character entities are special codes that begin with an ampersand and end with a semicolon.

Literal character    Character entity equivalent
<                    &lt;
>                    &gt;
"                    &quot;
'                    &apos;
&                    &amp;

Character entities can also be used to insert characters that are difficult to type, such as the copyright sign.
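
A short sketch of both uses, with placeholder text. The first paragraph escapes the angle brackets so that the tag name is shown as text rather than parsed as markup. The second uses &copy; and &euro;, two named entities picked here only for illustration.

```html
<!-- &lt; and &gt; display literal angle brackets instead of starting a tag -->
<p>In HTML, you define a paragraph using the &lt;p&gt; element.</p>

<!-- &copy; renders as the copyright sign and &euro; as the euro sign -->
<p>&copy; Example Company. All prices are in &euro;.</p>
```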

HTML Comments

To write an HTML comment, wrap it in the special markers <!-- and -->. For example:

<p>I'm not inside a comment</p>

<!-- <p>I am!</p> -->

The <html> tag

The root element of an HTML document, represented by the <html> tag, acts as a container for all other HTML elements on the page. It should always include the lang attribute to specify the primary language of the document's content, which helps browsers and assistive technologies properly interpret and present the content.

The Head

Content in the head is not displayed on the page in the web browser.

The elements in the head generally relate to metadata.

<title>

Represents the title of the entire page (not to be confused with <h1>, a heading in the body).

ex: <title>My Page</title>

Appears at the top of the browser window or as the title of a tab.

<meta>

Metadata is data that describes data. The <meta> element is the official way of including it in HTML.

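It is useful for specifying the document's character encoding and for adding information such as the author and a description of the page.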
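
A few common uses, sketched with placeholder values (the author name and description text are made up for illustration):

```html
<head>
  <!-- Declares the document's character encoding -->
  <meta charset="utf-8" />
  <!-- Names the author of the page -->
  <meta name="author" content="Jane Doe" />
  <!-- A short summary that search engines may show in results -->
  <meta name="description" content="Personal notes on structuring content with HTML." />
</head>
```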

<link>

The link element is used to define a relationship between the HTML page and external resources.
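
Two typical relationships, sketched with hypothetical file names (styles.css and favicon.ico are assumptions, not files from these notes). The rel attribute names the relationship and href points to the resource.

```html
<head>
  <!-- Applies an external stylesheet to the page -->
  <link rel="stylesheet" href="styles.css" />
  <!-- Sets the small icon shown in the browser tab -->
  <link rel="icon" href="favicon.ico" type="image/x-icon" />
</head>
```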

<script>

The script element loads an external JavaScript file into the HTML page.

This is not the only way to load JavaScript into a page, but it is the most reliable way.

ex: <script src="my-js-file.js" defer></script>

The defer boolean attribute tells the browser to run the JavaScript only after the HTML document has been fully parsed, so that the script does not reference HTML elements that don't yet exist.

Headings and Paragraphs

Headings and paragraphs give text structure on a webpage.

Beyond the obvious benefits of improved readability and visual appeal, a well-structured website improves its SEO and is more accessible to users who rely on screen readers.

The hierarchical structure defined by headings is also essential to styling a page with CSS.

For example:

<h1>The Crushing Bore</h1>

<p>By Chris Mills</p>

<h2>Chapter 1: The dark night</h2>

<p>
  It was a dark night. Somewhere, an owl hooted. The rain lashed down on the…
</p>

<h2>Chapter 2: The eternal silence</h2>

<p>
  Our protagonist could not so much as a whisper out of the shadowy figure…
</p>

<h3>The specter speaks</h3>

<p>
  Several more hours had passed, when all of a sudden the specter sat bolt
  upright and exclaimed, "Please have mercy on my soul!"
</p>

Paragraphs: <p>

Each paragraph on an HTML page needs to be wrapped in <p> tags. For example:

<p>I am a paragraph, oh yes I am.</p>

Headings: <h1> to <h6>

There are six heading elements available in HTML: <h1>, <h2>, <h3>, <h4>, <h5>, and <h6>.

They are used to create different heading levels, which establish a hierarchy of content on the page.

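General best practices: use a single <h1> per page, use headings in order without skipping levels, and avoid using more than three heading levels on a page unless necessary.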

Emphasis and Importance

Some elements provide emphasis to words, conveying additional meaning and making them more noticeable at a glance.

Emphasis <em>

Used to mark a word or phrase with emphasis. Styled as italics by default, but it can be configured using CSS.

Strong Importance <strong>

Like emphasis, this conveys that a word is important. Styled as bold by default, but it can be configured using CSS.
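
A minimal sketch contrasting the two elements, using placeholder sentences:

```html
<!-- <em> stresses a word, changing how the sentence would be spoken -->
<p>I am <em>glad</em> you weren't <em>late</em>.</p>

<!-- <strong> marks content as important or urgent -->
<p>This liquid is <strong>highly toxic</strong>.</p>
```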

About <b>, <i>, and <u>

Elements like <b>, <i>, and <u> were originally created for bold, italics, or underlined text when CSS was poorly supported or unavailable. These elements are classified as presentational elements because they only affect presentation, not semantics. Semantics are critical for accessibility, SEO, and other web standards, which is why presentational elements should generally be avoided.

HTML5 redefined <b>, <i>, and <u> with new semantic roles that are not always clear.

Best practice: Use <b>, <i>, or <u> only when there isn’t a more suitable semantic element.
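
As a rough guide, the HTML standard suggests uses like the ones sketched below. The sentences are placeholders, and a more specific element or CSS may still be the better choice on a real page.

```html
<!-- <i>: text in an alternate voice, such as a taxonomic name or foreign phrase -->
<p>The ruby-throated hummingbird (<i>Archilochus colubris</i>) is common here.</p>

<!-- <b>: text that draws attention without added importance, such as a keyword -->
<p>Slowly add the <b>flour</b> and <b>sugar</b> to the bowl.</p>

<!-- <u>: a non-textual annotation, such as flagging a misspelled word -->
<p>Someday I'll learn how to <u>spel</u> better.</p>
```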

Quotations

<blockquote>

A block quote denotes that a block of content (a paragraph, several paragraphs, a list, etc.) is quoted from elsewhere.

The cite attribute can hold the URL of the webpage that the quote is from.

By default, the entire block of content is rendered as indented:

<blockquote
  cite="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/blockquote">
  <p>
    The <strong>HTML <code>&lt;blockquote&gt;</code> Element</strong> (or
    <em>HTML Block Quotation Element</em>) indicates that the enclosed text is
    an extended quotation.
  </p>
</blockquote>

Inline Quotations <q>

The inline quotation is the same as a block quote except it applies to text inside a larger paragraph.

By default, the element simply applies quotation marks around the content.

<p>
  The quote element — <code>&lt;q&gt;</code> — is
  <q cite="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q">
    intended for short quotations that don't require paragraph breaks.
  </q>
</p>

Abbreviations <abbr>

The <abbr> element is used to wrap abbreviations or acronyms. You can use the title attribute to provide the full expansion, which appears as a tooltip when hovering over the abbreviation.

<p>
  I think <abbr title="Reverend">Rev.</abbr> Green did it in the kitchen with
  the chainsaw.
</p>

<address>