Ayyash
Elmota

SEO in Angular with SSR - Part II

Structuring SEO Links

Ayyash
Mar 29, 2022

In this part, let's go through the canonical links, alternate links, and url property of the document.

Heads-up: It is a rather long article, the purpose of which is not the end result, but rather the thought process.

The final result is on StackBlitz

Abiding by Google rules and recommendations for duplicate URLs, let me build the simplest case and work upwards:

For a project details page, in our single language project:

<link rel="canonical" href="https://garage.sekrab.com/projects/3" />

The only value this serves is if you have both http and https (and you really should not), your canonical link should be https.

Since the base URL for the canonical is always the live crawlable server, I'll put it aside in configuration. Also, the og:url property is recommended to have the same value.

In SeoService:

 private setUrl() {
      const url = Config.Seo.baseUrl + this.doc.location.pathname;
      // a method to update canonical (TODO)
      this.updateCanonical(url);
      // set og:url
      this.meta.updateTag({ property: 'og:url', content: url });
 }
// in Config, set baseUrl to "https://my.domain.com"

Parameters

The first issue to fix is the extra dynamic parameters. Each of these links should be reduced to its basic form without params:

/projects/2?sort=4
/projects/2;sort=3
/projects/2#something

This can be done by taking the doc.location.pathname, which already excludes the query string and the hash fragment, and stripping out the matrix params:

private setUrl() {
    let url = Config.Seo.baseUrl + this.doc.location.pathname;
    if (url.indexOf(';') > -1) {
        url = url.substring(0, url.indexOf(';'));
    }
    this.updateCanonical(url);
    this.meta.updateTag({ property: 'og:url', content: url });
}
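
As a side sketch, the stripping step can live in a small pure function, which makes it easy to test outside Angular. The name stripMatrixParams is hypothetical (it is not part of the service), and this variant assumes matrix params may appear on any path segment, not only the last one:

```typescript
// Hypothetical helper (not in the article's SeoService):
// removes Angular matrix params (";key=value") from every path segment.
function stripMatrixParams(pathname: string): string {
  return pathname
    .split('/')
    .map((segment) => segment.split(';')[0]) // keep only the part before the first ";"
    .join('/');
}

// location.pathname never contains "?sort=4" or "#something",
// so only the matrix form needs handling here.
const cleanedSamples = ['/projects/2', '/projects/2;sort=3', '/projects;sort=3/2']
  .map(stripMatrixParams);
console.log(cleanedSamples);
```

The indexOf(';') approach in the service is enough when matrix params only ever trail the URL; the per-segment split is just a more defensive take on the same idea.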

The link is initially created with no href property, and is set on every update. So we first create a private element to hold the link.

// SEO Service
export class SeoService {
  // make a private reference for the link
  private _canonicalLink: HTMLLinkElement;

  constructor(
    private title: Title,
    private meta: Meta,
    @Inject(DOCUMENT) private doc: Document
  ) {
    // add fixed tags
    this.AddTags();
  }

  AddTags() {
    // add tags
    this.meta.addTags(Config.Seo.tags);

    // create canonical link initially without href
    this.createCanonicalLink();
  }

  private createCanonicalLink() {
    // append canonical to head
    const _canonicalLink = this.doc.createElement('link');
    _canonicalLink.setAttribute('rel', 'canonical');
    this.doc.head.appendChild(_canonicalLink);
    // set our private element
    this._canonicalLink = _canonicalLink;
  }

  private setUrl() {
    let url = Config.Seo.baseUrl + this.doc.location.pathname;
    if (url.indexOf(';') > -1) {
      url = url.substring(0, url.indexOf(';'));
    }
    // set attribute
    this._canonicalLink.setAttribute('href', url);

    // also set the og:url 
    this.meta.updateTag({ property: 'og:url', content: url});
  }
  // the rest 
}

Search results canonical

For better SEO according to Google, search bots should be fed with distinctive result sets based on search parameters. Filtering results, however, produces overlap. For example,

"Top 23 Chinese restaurants in San Diego, page 3"

is a distinctive result for a search bot. Filtering for "non smoking" or "currently open" produces duplicate overlap.


Digression:

The following two links

/search?category=chinese&price=low&open=true&nonsmoking=true&query=korma&location=sandiego&page=3

/search?category=chinese&price=high&query=korma&location=sandiego&page=3

... are not quite identical, but the distinction does not serve SEO. To save on crawling budget, think of bots rather than humans: feed the bot a seed page and let it paginate. So the URL should be crafted so that all such search results produce a single link:

/search?category=chinese&location=sandiego&page=3
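
To make that concrete, here is a minimal sketch of reducing a search URL to its seed form by keeping an allow-list of params. The function name canonicalSearchUrl, the allow-list, and the example.com domain are all assumptions for illustration:

```typescript
// Hypothetical helper: keep only the params that define a distinct result set.
function canonicalSearchUrl(rawUrl: string, allowed: string[]): string {
  const parsed = new URL(rawUrl);
  const kept = new URLSearchParams();
  // walk the allow-list (not the incoming params) so the output order is stable
  for (const key of allowed) {
    const value = parsed.searchParams.get(key);
    if (value !== null) kept.append(key, value);
  }
  const query = kept.toString();
  return parsed.origin + parsed.pathname + (query ? '?' + query : '');
}

// example.com is a placeholder domain
const seedLink = canonicalSearchUrl(
  'https://example.com/search?category=chinese&price=low&open=true&nonsmoking=true&query=korma&location=sandiego&page=3',
  ['category', 'location', 'page']
);
console.log(seedLink);
```

Which params make it into the allow-list is a per-site decision, as the next paragraph explains.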

Every website has its own purpose. You might want your site to index "top 100 non-smoking cafes in San Diego", if so, do not ignore the parameters. Your site will produce three different canonical links, one for smoking, one for non-smoking, and one without preference. You can also use sitemaps, or include links around the site, for the non-smoking results, to increase their ranking.

The third link has duplicates, however. The best way to avoid them is to provide users with a mandatory pre-list of a certain filter, which guarantees a smaller subset of results. Not only is it better SEO, but also a better experience. Having to search millions of records for "Grisham crime" titles is a worse experience than choosing "Fiction - crime" first, then searching for "Grisham". But every website has its own purpose.

Another enhancement is to make a prominent parameter part of the URL path. In this case, category:

/search/chinese?location=sandiego&page=3

As a best practice, also use readable words in parameters rather than ids. So the above is better than

/search/3424?location=4544&page=3

Which also means the category param name, and display name, should be available:

// category
{
   id: '3242423', // optional
   key: 'chinese', // programmatic key
   value: 'Chinese food' // display value
}

Digression ends


Back to our simple project. We need to rewrite to include some of the matrix params we initially stripped out. With this end result in mind:

https://garage.sekrab.com/projects?category=turtles&page=1

PS. Not going to cater for /projects/turtles/?page=1, which is recommended, to stay within scope.

In the list page where results are fetched, we need to change it to send everything:

this.seoService.setSearchResults(342, 'Turtles', 'turtles', 1);

Okay, let's step back and organize our models because that looks messy.

// search list params model
export interface IListParams {
   total: number;
   page: number;
   category?: ICategory; // expected to be retrieved
}
// category model
export interface ICategory {
   id?: string; // optional for better db indexing
   key?: string; // used as url param
   value?: string; // used for display purposes
}
// our project model
export interface IProject {
    id: string;
    title: string;
    description?: string;
    image?: string;
    category?: ICategory; // this is now modeled
}

In the search component, the result params are passed back

 ngOnInit(): void {

    this.projects$ = of(projects).pipe(
      map((projects) => {
        // assuming search occurs on url params, or query params.
        // the result set should include exact category
        const params: IListParams = {
          total: 234,
          page: 1,
          category: { key: 'turtles', value: 'Turtles' },
        };

        this.seoService.setSearchResults(params);
        return projects;
      })
    );
  }

Let's rewrite the function that sets search results SEO

setSearchResults(params: IListParams) {
    // use params.total and params.category.value for title and description

    this.setTitle(
      toFormat(RES.SEO_CONTENT.PROJECT_RESULTS_TITLE, params.total, params.category.value)
    );
    this.setDescription(
      toFormat(RES.SEO_CONTENT.PROJECT_RESULTS_DESC, params.total, params.category.value)
    );

    // pass params as is
    this.setUrl(params);
    this.setImage();
  }
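
The toFormat helper is not shown in this part. Judging by the $0, $1 placeholders used in this series, a minimal stand-in could look like the sketch below. This is my assumption of its behavior, not the actual implementation, and the resource string is made up:

```typescript
// Assumed behavior: replace $0, $1, ... with the argument at that index.
function toFormat(template: string, ...args: Array<string | number>): string {
  return template.replace(/\$(\d+)/g, (match: string, index: string) => {
    const value = args[Number(index)];
    // leave the placeholder untouched when no argument was supplied for it
    return value === undefined ? match : String(value);
  });
}

// hypothetical resource string, not the real RES.SEO_CONTENT value
const resultsTitle = toFormat('$0 projects in $1', 234, 'Turtles');
console.log(resultsTitle);
```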

So the setUrl now accepts an optional argument:

private setUrl(params?: IListParams) {
    let url = Config.Seo.baseUrl + this.doc.location.pathname;
    if (url.indexOf(';') > -1) {
      url = url.substring(0, url.indexOf(';'));
    }

    // if category or page exist, append them as query params,
    // whether or not the current URL carried matrix params
    // the result should look like this
    // "https://garage.sekrab.com/projects?category=turtles&page=1"
    if (params) {
      const s = new URLSearchParams();
      if (params.category?.key) s.append('category', params.category.key);
      if (params.page) s.append('page', params.page.toString());
      const query = s.toString();
      // avoid a dangling "?" when nothing was appended
      if (query) url += '?' + query;
    }
    // set attribute and og:url
    this._canonicalLink.setAttribute('href', url);
    this.meta.updateTag({ property: 'og:url', content: url });
}
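
For testing outside Angular, the same logic can be sketched as a pure function that takes the base URL and pathname as arguments. The names buildCanonicalUrl, CanonicalParams and CanonicalCategory are hypothetical. I am also assuming the query params should be appended even when the incoming URL has no matrix params:

```typescript
// Local stand-ins for the article's ICategory and IListParams
interface CanonicalCategory { key?: string; value?: string; }
interface CanonicalParams { total: number; page: number; category?: CanonicalCategory; }

// Hypothetical pure version of setUrl: returns the canonical URL instead of writing to the DOM
function buildCanonicalUrl(baseUrl: string, pathname: string, params?: CanonicalParams): string {
  const semi = pathname.indexOf(';');
  // strip matrix params, if any
  let url = baseUrl + (semi > -1 ? pathname.substring(0, semi) : pathname);

  if (params) {
    const s = new URLSearchParams();
    if (params.category?.key) s.append('category', params.category.key);
    if (params.page) s.append('page', params.page.toString());
    const query = s.toString();
    if (query) url += '?' + query; // no dangling "?"
  }
  return url;
}

const listCanonical = buildCanonicalUrl(
  'https://garage.sekrab.com',
  '/projects;category=turtles;page=1',
  { total: 234, page: 1, category: { key: 'turtles', value: 'Turtles' } }
);
console.log(listCanonical);
```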

Changing category to an object reflects on the project title as well:

  setProject(project: IProject) {
    // set title
    this.setTitle(
      toFormat(RES.SEO_CONTENT.PROJECT_TITLE,project.title,project.category.value)
    );
    //... the rest
  }

Side note: if the canonical link has queryParams and we still want to use routeParams, our component must be set up to listen to both. If that is too complicated, and even though the guidelines do not speak against matrix params, I suggest you keep the ones you want crawled as queryParams.

Bot click versus href

Google bot promises to load dynamic content and crawl it, but only through a proper href attribute on an a tag. To cater for that, all routerLink attributes should be applied on a tags. For pagination, a click is caught to update the page dynamically without changing the URL, and the next page URL is supplied to the href attribute. Then the click is canceled, which bots do not see.

In the component

@Component({
  template: `
  ... add link
  <a (click)="next($event)" [href]="seoLink">Next</a>
 `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class ProjectListComponent implements OnInit {
  // define seo link
  seoLink: string;
  ngOnInit(): void {
    this.projects$ = of(projects).pipe(
      map((projects) => {
        const params: IListParams = {
          total: 234,
          page: 1,
          category: { key: 'turtles', value: 'Turtles' },
        };
        // here, update the seo link, this needs to be done only once for SSR
        this.seoLink = this.seoService.url + `;category=${params.category?.key};page=${params.page + 1}`;

        this.seoService.setSearchResults(params);
        return projects;
      })
    );
  }

  next(clickEvent: MouseEvent) {
    // go to next page here...

    // then cancel click
    clickEvent.preventDefault();
  }
}

So in SeoService let me add the url getter:

  get url(): string {
    let url = this.doc.location.pathname;
    // clean out the matrix params
    if (url.indexOf(';') > -1) {
      url = url.substring(0, url.indexOf(';'));
    }
    return url;
  }
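
Put together, building the next-page href is a small string exercise. A sketch, assuming a hypothetical nextPageHref function and a category key that may need encoding:

```typescript
// Hypothetical helper: builds the href a bot would follow to reach the next page.
function nextPageHref(pathname: string, categoryKey: string, currentPage: number): string {
  const semi = pathname.indexOf(';');
  // clean out the current matrix params, like the url getter does
  const clean = semi > -1 ? pathname.substring(0, semi) : pathname;
  return `${clean};category=${encodeURIComponent(categoryKey)};page=${currentPage + 1}`;
}

const botLink = nextPageHref('/projects;category=turtles;page=1', 'turtles', 1);
console.log(botLink);
```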

We can impose more design rules, and create common functions and mappers to contain parameters, but that is out of the scope of this article. (Maybe one Tuesday?)

Default and fallback

Just like we set page title on route event NavigationEnd, we are going to set the canonical as well. So setPageTitle was obviously the wrong name of the method.

// SeoService, rename setPageTitle to setPage
setPage(title: string) {
    // set to title if found, else fall back to default
    this.setTitle(RES.PAGE_TITLES[title] || RES.DEFAULT_PAGE_TITLE);

    // also reset canonical
    this.setUrl();
  }

When it comes to multilingual sites, either the interface alone is translated, or the data as well. According to Google localization guidelines, when data is multilingual, the results produced are different, so the pages are not duplicates, and each language version gets its own canonical link.

If only the UI is translated but the content is not, the pages are identical, so there must be one default canonical link. Each language served must also point to all other alternative languages of the content.

Language and regions

While the language tells the bot what language the interface or content is written in, region tells it which region the content is being served for. It can be as simple as en, and as wild as: en-GB, en-US, en-DE, en-SA... etc.

To cater for all possible regions and languages we can set an x-default. So the one alternate link we know for sure looks like this

<link rel="alternate" href="https://[default-subdomain].baseurl/[default-language]/link" hreflang="x-default" />

The subdomain is a recommended way to serve regions, but it should not be used as a search parameter. A user living in Spain (ES) could be searching for cafes in Dubai (AE), with English as their browser default language (en). In such a case, Google would produce this result:

"34 Cafes in Dubai - Sekrab Site." where hreflang=en-ES.

Some of the purposes "region" serves, for example ES:

  • The content default language is Spanish - user can change this
  • The default currency used is Euro - user can change this
  • The main page shows recommendations in Spain
  • The books shown are legal to sell in Spain
  • The items shown can be delivered in Spain

In SEO links, that looks like this:

<link rel="alternate" href="https://es.baseurl/en/cafes?city=dubai" hreflang="en-ES" />

Code-wise, we don't supply all subdomains and languages. Let's start with an odd combination:

  • I serve my content in four languages (en, es, de, fr)
  • I give special attention to two regions (es, mx)

The extreme case we can target produces the following alternate links:

<link rel="alternate" href="https://es.baseurl/en/link" hreflang="en-ES" />
<link rel="alternate" href="https://es.baseurl/de/link" hreflang="de-ES" />
<link rel="alternate" href="https://es.baseurl/fr/link" hreflang="fr-ES" />
<link rel="alternate" href="https://es.baseurl/es/link" hreflang="es-ES" />
<link rel="alternate" href="https://mx.baseurl/en/link" hreflang="en-MX" />
<link rel="alternate" href="https://mx.baseurl/de/link" hreflang="de-MX" />
<link rel="alternate" href="https://mx.baseurl/fr/link" hreflang="fr-MX" />
<link rel="alternate" href="https://mx.baseurl/es/link" hreflang="es-MX" />

<!-- default for other regions -->
<link rel="alternate" href="https://www.baseurl/en/link" hreflang="en" />
<link rel="alternate" href="https://www.baseurl/de/link" hreflang="de" />
<link rel="alternate" href="https://www.baseurl/fr/link" hreflang="fr" />
<link rel="alternate" href="https://www.baseurl/es/link" hreflang="es" />
<!-- default for all other languages, serve English version -->
<link rel="alternate" href="https://www.baseurl/en/link" hreflang="x-default" />

PS. Google shall find you, whether you include all links or not. This only serves the purpose of protecting your content from theft, but Google always finds you!

As it builds up, it causes header pollution. In a less automated way, we can remove the ones that are too specific. For example, I am pretty sure (correct me if I am wrong) that German people in Spain and Mexico speak the same language.

<link rel="alternate" href="https://es.baseurl/en/link" hreflang="en-ES" />
<link rel="alternate" href="https://es.baseurl/es/link" hreflang="es-ES" />
<link rel="alternate" href="https://mx.baseurl/en/link" hreflang="en-MX" />
<link rel="alternate" href="https://mx.baseurl/es/link" hreflang="es-MX" />

<!-- default for other regions and languages -->
<link rel="alternate" href="https://www.baseurl/en/link" hreflang="en" />
<link rel="alternate" href="https://www.baseurl/de/link" hreflang="de" />
<link rel="alternate" href="https://www.baseurl/fr/link" hreflang="fr" />
<link rel="alternate" href="https://www.baseurl/es/link" hreflang="es" />
<!-- default for all other languages, serve English version -->
<link rel="alternate" href="https://www.baseurl/en/link" hreflang="x-default" />

The alternate links are an array that we will keep in the service, so we can append to it and reset it. In SeoService:

export class SeoService {
  // add reference to all alternate link to update later
  private _alternateLinks: HTMLLinkElement[] = [];
  constructor(
    private title: Title,
    private meta: Meta,
    @Inject(DOCUMENT) private doc: Document
  ) {
    // add fixed tags
    this.AddTags();
  }

  AddTags() {
    // ...
    // add alternate language, one at a time, here, TODO:
    forEachLanguageRegionSupported.createAlternateLink(n); 
  }
  private createAlternateLink(language?: string, region?: string) {
    // append alternate link to head
    const _link = this.doc.createElement('link');
    _link.setAttribute('rel', 'alternate');

    // if region exists, add -region
    _link.setAttribute('hreflang', language + (region ? '-'+ region : ''));

    this.doc.head.appendChild(_link);
    this._alternateLinks.push(_link);
  }
  // .... rest
}

So we first have to place our regions and languages, in Config, something like this.

hrefLangs: [
  { region: 'ES', language: 'es' },
  { region: 'ES', language: 'en' },
  { region: 'MX', language: 'es' },
  { region: 'MX', language: 'en' },
  { language: 'de' },
  { language: 'fr' },
  { language: 'es' },
  { language: 'en' },
  { language: 'x-default'} // this will use 'en' fall back 
],

Back to our service, we need to create an alternate link for every combination.

  // in place of forEachLanguageRegionSupported
   Config.Seo.hrefLangs.forEach((n) => {
      this.createAlternateLink(n.language, n.region);
    });

So the links are set up, let's see how they get updated.

The final link is constructed like this:

https://(n.region || www).baseUrl.com/(n.language || default_language)/doc_url_without_lang

To add insult to injury, this should apply to the right hreflang link. It is easier to re-apply both attributes, so I am rewriting the original array of _alternateLinks, to have empty links. Like this:

// in SEO Service
AddTags() {
    // ...
    // add alternate language, empty
    Config.Seo.hrefLangs.forEach(() => {
      this.createAlternateLink();
    });
  }

 private createAlternateLink() {
    // append alternate link to head, with only the rel attribute
    const _link = this.doc.createElement('link');
    _link.setAttribute('rel', 'alternate');
    this.doc.head.appendChild(_link);
    this._alternateLinks.push(_link);
  }

Then in setUrl we should set attributes of the alternate links, let's create a private method for that

private setAlternateLinks() {
    Config.Seo.hrefLangs.forEach((n, i) => {
      // TODO: this
      const url = `https://(n.region || defaultRegion).baseUrl.com/(n.language)/doc_url_without_lang`;
      const hreflang = n.language + (n.region ? '-'+ n.region : '');
      this._alternateLinks[i].setAttribute('href', url);
      this._alternateLinks[i].setAttribute('hreflang', hreflang);
    });
  }

First, the doc_url_without_lang. Organizing all of our paths to begin with the language directory is good design, and SEO friendly. It guarantees that the first directory is reserved for language, and it is understood by search bots.

If in development there is no language in url, check for environment.production before you strip out.

As for language, if it is x-default, we will replace it with Config defaultLanguage. And the fall back for the region is Config defaultRegion.

      let lang = n.language;
      if (lang === 'x-default') lang = Config.Seo.defaultLanguage;

      // current path without language, is as simple as removing /en/
      const path = this.doc.location.pathname.substring(4);
      const url = `https://${n.region || Config.Seo.defaultRegion}.domain.com/${lang}/${path}`;
      // ... etc
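
The substring(4) shortcut assumes a two-letter language directory is always present. A slightly more defensive sketch, which also covers the development case with no language in the URL, could check the first segment against the supported languages. The function name and the language list are assumptions:

```typescript
// Hypothetical helper: drops the leading language directory only if it is a known language.
function stripLanguagePrefix(pathname: string, languages: string[]): string {
  const segments = pathname.split('/'); // "/en/projects" -> ["", "en", "projects"]
  const first = segments[1] ?? '';
  if (languages.indexOf(first) > -1) {
    segments.splice(1, 1); // remove the language segment
  }
  // drop the leading slash, to match what substring(4) returns
  return segments.join('/').replace(/^\//, '');
}

const supportedLanguages = ['en', 'es', 'de', 'fr'];
console.log(stripLanguagePrefix('/en/projects/3', supportedLanguages));
console.log(stripLanguagePrefix('/projects/3', supportedLanguages));
```

Both calls are meant to produce the same relative path, so the rest of the URL construction does not need to know which environment it runs in.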

Before we fix the last bit, the "domain.com," let's step back to the canonical link. Now the baseUrl is no longer useful as it is. Rewriting the Config first to have movable parts.

baseUrl: 'https://$0.sekrab.com/$1/$2',

To figure out the canonical, every project has its own purpose. Consider the three scenarios:

  1. mx, es and www have very subtle differences, (like default currency, or sort order of items).
    Then fall back the canonical to one default link. So all canonical links will have www and en in the URL.
  2. Data is translated. In this case, the language is fed by the current site language.
  3. Regions have huge differences. Then the region is fed by the current site region. That would be the most extreme.

So we start there.
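
As a rough sketch of how I read the three scenarios, the choice boils down to which region and language get fed into the canonical URL. The strategy names and the canonicalContext function are made up for illustration:

```typescript
// Hypothetical names for the three scenarios above
type CanonicalStrategy = 'single-default' | 'per-language' | 'per-region';

interface SiteContext { region: string; language: string; }

// Picks the region and language that the canonical URL should carry.
function canonicalContext(
  strategy: CanonicalStrategy,
  current: SiteContext,   // the site's own region and language
  fallback: SiteContext   // e.g. { region: 'www', language: 'en' }
): SiteContext {
  switch (strategy) {
    case 'single-default':
      // 1. subtle differences: everything points to one default link
      return fallback;
    case 'per-language':
      // 2. data is translated: language follows the current site language
      return { region: fallback.region, language: current.language };
    case 'per-region':
      // 3. regions differ hugely: region (and language) follow the current site
      return { region: current.region, language: current.language };
  }
}

const mexicoSite: SiteContext = { region: 'mx', language: 'es' };
const defaults: SiteContext = { region: 'www', language: 'en' };
console.log(canonicalContext('per-language', mexicoSite, defaults));
```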

So, where do we get site language and region?

You can:

  • Define them in environment files (and then make multiple builds, like i18n suggests)
  • Define them in external configs (and then also make multiple builds).
  • Inject them from server. (and make a single build, this is a future post 🔆).

But whatever you do, do not extract them from the current URL. (Ask me why not).

So in Config:

export const Config = {
  Basic: {
    // from environment or fed by server
    language: 'es',
    region: 'mx'
  },
  // ...
}

Back to our SeoService, adjust setUrl and setAlternateLinks

   private setUrl(params?: IListParams) {
    // prefix with baseUrl and remove /en/ (make an exception for development environment)
    const path = this.doc.location.pathname.substring(4);

    let url = toFormat(
      Config.Seo.baseUrl,
      Config.Basic.region,
      Config.Basic.language,
      path
    );

    // strip matrix params
    if (url.indexOf(';') > -1) {
      url = url.substring(0, url.indexOf(';'));
    }

    // if category or page exist, append them as query params
    if (params) {
      const s = new URLSearchParams();
      if (params.category?.key) s.append('category', params.category.key);
      if (params.page) s.append('page', params.page.toString());
      const query = s.toString();
      // avoid a dangling "?" when nothing was appended
      if (query) url += '?' + query;
    }

    // set attribute and og:url
    this._canonicalLink.setAttribute('href', url);
    this.meta.updateTag({ property: 'og:url', content: url });

    // pass the path to alternate links
    this.setAlternateLinks(path);

  }

  private setAlternateLinks(path: string) {
    Config.Seo.hrefLangs.forEach((n, i) => {

      let lang = n.language;
      if (lang === 'x-default') lang = Config.Seo.defaultLanguage;

      // construct the url
      const url = toFormat(
        Config.Seo.baseUrl,
        n.region || Config.Seo.defaultRegion,
        lang,
        path
      );

      // construct hreflang
      const hreflang = n.language + (n.region ? '-' + n.region : '');

      this._alternateLinks[i].setAttribute('href', url);
      this._alternateLinks[i].setAttribute('hreflang', hreflang);
    });
  }
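
To check the output without a browser, the same loop can be sketched as a pure function that returns href and hreflang pairs. The names buildAlternateLinks, HrefLang and AlternateLink are hypothetical. Lowercasing the region for the subdomain is my own addition, to match the es.baseurl style of the examples above:

```typescript
interface HrefLang { language: string; region?: string; }
interface AlternateLink { href: string; hreflang: string; }

// Hypothetical pure version of setAlternateLinks
function buildAlternateLinks(
  template: string, // e.g. 'https://$0.sekrab.com/$1/$2'
  hrefLangs: HrefLang[],
  defaultRegion: string,
  defaultLanguage: string,
  path: string
): AlternateLink[] {
  return hrefLangs.map((n) => {
    // x-default falls back to the default language in the URL only
    const lang = n.language === 'x-default' ? defaultLanguage : n.language;
    // assumption: subdomains are lowercase, while hreflang keeps the region as configured
    const region = (n.region || defaultRegion).toLowerCase();
    const href = template
      .replace('$0', () => region)
      .replace('$1', () => lang)
      .replace('$2', () => path);
    const hreflang = n.language + (n.region ? '-' + n.region : '');
    return { href, hreflang };
  });
}

const alternates = buildAlternateLinks(
  'https://$0.sekrab.com/$1/$2',
  [{ region: 'ES', language: 'en' }, { language: 'de' }, { language: 'x-default' }],
  'www',
  'en',
  'projects'
);
console.log(alternates);
```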

There. Our alternate links are ready to roll.

I decided to remove og:locale since it did not seem to be flexible enough.

SSR

The issue I ran into when testing the app in SSR was the duplicate link tags. The links were appended on both platforms. This is good news: we can confine link creation and update processes to the server platform only. There is no immediate value in making the changes happen on the browser platform. Combine it with the environment check to be able to test in development.

Use the PLATFORM_ID injection token, or the Platform service from the @angular/cdk package.

// return before creating link tags, or setUrl
if (environment.production && this.platform.isBrowser) return;

The other way is more bitter. It involves removing all tags, before adding them again, each time the route updates. Not going in that direction.

The last option is to check for the existence of the elements first, using querySelector and querySelectorAll. Change AddTags as follows:

 AddTags() {
     // ...

    // add canonical and alternate links
    const _canonical = this.doc.querySelector('link[rel="canonical"]');
    if (_canonical) {
      this._canonicalLink = _canonical as HTMLLinkElement;
    } else {
      this.createCanonicalLink();
    }

    // add alternate language, one at a time, here
    const _links = this.doc.querySelectorAll('link[rel="alternate"]');
    if (_links.length > 0) {
      this._alternateLinks = Array.from(_links) as HTMLLinkElement[];
    } else {
      Config.Seo.hrefLangs.forEach(() => this.createAlternateLink());
    }
  }

Tested. Works.

The other issue you might run into: if your live server uses a reverse proxy, the current URL gives wrong results on the server. It is localhost instead of the live URL. That is one reason to avoid getting region information from the URL.

Google Search Snippets.

One more addition to make. But let's do that next week. 😴

Thanks for reaching this far, even if you scrolled down quickly, I appreciate it. Let me know if something caught your attention.
