diff --git a/website/docs/api/pipeline-functions.md b/website/docs/api/pipeline-functions.md
index 628d36000..a776eca9b 100644
--- a/website/docs/api/pipeline-functions.md
+++ b/website/docs/api/pipeline-functions.md
@@ -113,8 +113,7 @@ end of the pipeline and after all other components.
 
 Split tokens longer than a minimum length into shorter tokens. Intended for use
 with transformer pipelines where long spaCy tokens lead to input text that
-exceed the transformer model max length. See
-[managing transformer model max length limitations](/usage/embeddings-transformers#transformer-max-length).
+exceeds the transformer model max length.
 
 > #### Example
 >
diff --git a/website/docs/usage/embeddings-transformers.md b/website/docs/usage/embeddings-transformers.md
index fdf15d187..7e47ac9d2 100644
--- a/website/docs/usage/embeddings-transformers.md
+++ b/website/docs/usage/embeddings-transformers.md
@@ -481,50 +481,6 @@ custom learning rate for each component. Instead of a constant, you can also
 provide a schedule, allowing you to freeze the shared parameters at the start
 of training.
 
-### Managing transformer model max length limitations {#transformer-max-length}
-
-Many transformer models have a limit on the maximum number of tokens that the
-model can process, for example BERT models are limited to 512 tokens. This limit
-refers to the number of transformer tokens (BPE, WordPiece, etc.), not the
-number of spaCy tokens.
-
-To be able to process longer texts, the spaCy [`transformer`](/api/transformer)
-component uses [`span_getters`](/api/transformer#span_getters) to convert a
-batch of [`Doc`](/api/doc) objects into lists of [`Span`](/api/span) objects. A
-span may correspond to a doc (for `doc_spans`), a sentence (for `sent_spans`) or
-a window of spaCy tokens (`strided_spans`). If a single span corresponds to more
-transformer tokens than the transformer model supports, the spaCy pipeline can't
-process the text because some spaCy tokens would be left without an analysis.
-
-In general, it is up to the transformer pipeline user to manage the input texts
-so that the model max length is not exceeded. If you're training a **new
-pipeline**, you have a number of options to handle the max length limit:
-
-- Use `doc_spans` with short texts only
-- Use `sent_spans` with short sentences only
-- For `strided_spans`, lower the `window` size to be short enough for your input
-  texts (and don't forget to lower the `stride` correspondingly)
-- Implement a [custom span getter](#transformers-training-custom-settings)
-
-You may still run into the max length limit if a single spaCy token is very
-long, like a long URL or a noisy string, or if you're using a **pretrained
-pipeline** like `en_core_web_trf` with a fixed `window` size for
-`strided_spans`. In this case, you need to modify either your texts or your
-pipeline so that you have shorter spaCy tokens. Some options:
-
-- Preprocess your texts to clean up noise and split long tokens with whitespace
-- Add a `token_splitter` to the beginning of your pipeline to break up
-  tokens that are longer than a specified length:
-
-  ```python
-  config={"min_length": 20, "split_length": 5}
-  nlp.add_pipe("token_splitter", config=config, first=True)
-  ```
-
-  In this example, tokens that are at least 20 characters long will be split up
-  into smaller tokens of 5 characters each, resulting in strided spans that
-  correspond to fewer transformer tokens.
-
 ## Static vectors {#static-vectors}
 
 If your pipeline includes a **word vectors table**, you'll be able to use the
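
For reference, a minimal sketch of the `token_splitter` usage that the surviving `pipeline-functions.md` entry describes. This note is illustrative and not part of the patch: it assumes the `en_core_web_trf` pipeline is installed, and the input text is made up.

```python
import spacy

# Assumes en_core_web_trf has been downloaded, e.g. via
# `python -m spacy download en_core_web_trf`.
nlp = spacy.load("en_core_web_trf")

# Split any spaCy token of 20+ characters into 5-character pieces before the
# transformer runs, so each span maps to fewer transformer tokens.
config = {"min_length": 20, "split_length": 5}
nlp.add_pipe("token_splitter", config=config, first=True)

# Illustrative input: a long URL that would otherwise remain a single token.
doc = nlp("See https://example.com/some/very/long/noisy/path?with=query&p=1")
print([token.text for token in doc])
```

Passing `first=True` matters here: the splitting has to happen before the `transformer` component converts the doc into spans for the model.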