minor touchups

2015-12-17 20:00:20 +00:00 · 2015-12-17 20:00:20 +00:00 · 89ca5ba164
parent fa86945c3b
commit 89ca5ba164
1 changed files with 134 additions and 135 deletions
--- a/README.rst
+++ b/README.rst
@ -248,141 +248,6 @@ Examples and Advanced Usage
 See the `examples <https://github.com/tqdm/tqdm/tree/master/examples>`__ folder or
 import the module and run ``help()``.

-How to make a good progress bar
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-A good progress bar is a useful progress bar. To be useful, ``tqdm`` displays
-statistics and uses smart algorithms to predict and to automagically adapt to
-all your use cases with no or minimal configuration.
-
-However, there is one thing that ``tqdm`` cannot do: choose a pertinent
-progress indicator. To display a useful progress bar, it is very important that
-you ensure that you supply ``tqdm`` with the most pertinent progress indicator,
-which will reflect most accurately the current state of your program.
-Usually, a good way is to preprocess quickly to first evaluate the total amount
-of work to do before beginning the real processing.
-
-To illustrate the importance of a good progress indicator, let's take the
-following example: you want to walk through all files of a directory and
-process their contents to do your biddings.
-
-Here is a basic program to do that:
-
-.. code:: python
-
-    import os
-    from tqdm import tqdm, trange
-    from time import sleep
-
-    def dosomething(buf):
-        """Do something with the content of a file"""
-        sleep(0.01)
-        pass
-
-    def walkdir(folder):
-        """Walk through each files in a directory"""
-        for dirpath, dirs, files in os.walk(folder):
-            for filename in files:
-                yield os.path.abspath(os.path.join(dirpath, filename))
-
-    def process_content_no_progress(inputpath, blocksize=1024):
-        for filepath in walkdir(inputpath):
-            with open(filepath, 'rb') as fh:
-                buf = 1
-                while (buf):
-                    buf = fh.read(blocksize)
-                    dosomething(buf)
-
-``process_content_no_progress()`` does the job alright, but it does not show
-any information about the current progress, nor how long it will take.
-
-To quickly fix that using ``tqdm``, we can use this naive approach:
-
-.. code:: python
-
-    def process_content_with_progress1(inputpath, blocksize=1024):
-        for filepath in tqdm(walkdir(inputpath), leave=True):
-            with open(filepath, 'rb') as fh:
-                buf = 1
-                while (buf):
-                    buf = fh.read(blocksize)
-                    dosomething(buf)
-
-``process_content_with_progress1()`` will load ``tqdm()``, but since the
-iterator does not provide any length (``os.walkdir()`` cannot know the total
-files count), there is only an indication of the current and past program state,
-no prediction:
-
-``4it [00:03,  2.79it/s]``
-
-The way to get predictive information is to know the total amount of work to be
-done. Since ``os.walkdir()`` cannot give us this information, we need to
-precompute this by ourselves:
-
-.. code:: python
-
-    def process_content_with_progress2(inputpath, blocksize=1024):
-        # Preprocess the total files count
-        filecounter = 0
-        for dirpath, dirs, files in tqdm(os.walk(inputpath)):
-            for filename in files:
-                filecounter += 1
-
-        for filepath in tqdm(walkdir(inputpath), total=filecounter, leave=True):
-            with open(filepath, 'rb') as fh:
-                buf = 1
-                while (buf):
-                    buf = fh.read(blocksize)
-                    dosomething(buf)
-
-``process_content_with_progress2()`` is better than the naive approach because
-now we have predictive information:
-
-50%|██████████████████████\                      \| 2/4 [00:00<00:00,  4.06it/s]
-
-However, the progress is not smooth: it increments in steps, 1 step being
-1 file processed. The problem is that we do not just walk through files tree,
-but we process the files contents. Thus, if we stumble on one big fat file,
-it will take a huge deal more time to process than other smaller files, but
-the progress bar cannot know that, because we only supplied the files count,
-so it considers that every element is of equal processing weight.
-
-To fix that, we should use another indicator than the files count: the total
-sum of all files sizes. This would be more pertinent since the data we
-process is the files content, so there is a direct relation between size and
-content.
-
-Below we implement this approach, using a manually updated ``tqdm`` bar
-because ``tqdm`` will work on size, while the for loop works on files paths
-(so they are not at all working on the same quantity anymore):
-
-.. code:: python
-
-    def process_content_with_progress3(inputpath, blocksize=1024):
-        # Preprocess the total files sizes
-        sizecounter = 0
-        for dirpath, dirs, files in tqdm(os.walk(inputpath)):
-            for filename in files:
-                fullpath = os.path.abspath(os.path.join(dirpath, filename))
-                sizecounter += os.stat(fullpath).st_size 
-
-        # Load tqdm with size counter instead of files counter
-        with tqdm(total=sizecounter, leave=True, unit='B', unit_scale=True) as pbar:
-            for dirpath, dirs, files in os.walk(inputpath):
-                for filename in files:
-                    fullpath = os.path.abspath(os.path.join(dirpath, filename))
-                    with open(fullpath, 'rb') as fh:
-                        buf = 1
-                        while (buf):
-                            buf = fh.read(blocksize)
-                            dosomething(buf)
-                            if buf: pbar.update(len(buf))
-
-And here is the result: a much smoother progress bar with meaningful
-predicted time and statistics:
-
-47%|██████████████████▍\                    \| 152K/321K [00:03<00:03, 46.2KB/s]
-
 Hooks and callbacks
 ~~~~~~~~~~~~~~~~~~~

@ -471,6 +336,140 @@ In case you're interested in how this works (and how to modify it for your
 own callbacks), see the `examples <https://github.com/tqdm/tqdm/tree/master/examples>`__
 folder or import the module and run ``help()``.

+How to make a good progress bar
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A good progress bar is a useful progress bar. To be useful, ``tqdm`` displays
+statistics and uses smart algorithms to predict and automagically adapt to
+a variety of use cases with no or minimal configuration.
+
+However, there is one thing that ``tqdm`` cannot do: choose a pertinent
+progress indicator. To display a useful progress bar, it is very important that
+you ensure that you supply ``tqdm`` with the most pertinent progress indicator,
+which will reflect most accurately the current state of your program.
+Usually, a good way is to preprocess quickly to first evaluate the total amount
+of work to do before beginning the real processing.
+
+To illustrate the importance of a good progress indicator, let's take the
+following example: you want to walk through all files of a directory and
+process their contents to do your biddings.
+
+Here is a basic program to do that:
+
+.. code:: python
+
+    import os
+    from tqdm import tqdm, trange
+    from time import sleep
+
+    def dosomething(buf):
+        """Do something with the content of a file"""
+        sleep(0.01)
+        pass
+
+    def walkdir(folder):
+        """Walk through each files in a directory"""
+        for dirpath, dirs, files in os.walk(folder):
+            for filename in files:
+                yield os.path.abspath(os.path.join(dirpath, filename))
+
+    def process_content_no_progress(inputpath, blocksize=1024):
+        for filepath in walkdir(inputpath):
+            with open(filepath, 'rb') as fh:
+                buf = 1
+                while (buf):
+                    buf = fh.read(blocksize)
+                    dosomething(buf)
+
+``process_content_no_progress()`` does the job alright, but it does not show
+any information about the current progress, nor how long it will take.
+
+To quickly fix that using ``tqdm``, we can use this naive approach:
+
+.. code:: python
+
+    def process_content_with_progress1(inputpath, blocksize=1024):
+        for filepath in tqdm(walkdir(inputpath), leave=True):
+            with open(filepath, 'rb') as fh:
+                buf = 1
+                while (buf):
+                    buf = fh.read(blocksize)
+                    dosomething(buf)
+
+``process_content_with_progress1()`` will load ``tqdm()``, but since the
+iterator does not provide any length (``os.walkdir()`` does not have a
+``__len__()`` method for the total files count), there is only an indication
+of the current and past program state, no prediction:
+
+``4it [00:03,  2.79it/s]``
+
+The way to get predictive information is to know the total amount of work to be
+done. Since ``os.walkdir()`` cannot give us this information, we need to
+precompute this by ourselves:
+
+.. code:: python
+
+    def process_content_with_progress2(inputpath, blocksize=1024):
+        # Preprocess the total files count
+        filecounter = 0
+        for dirpath, dirs, files in tqdm(os.walk(inputpath)):
+            for filename in files:
+                filecounter += 1
+
+        for filepath in tqdm(walkdir(inputpath), total=filecounter, leave=True):
+            with open(filepath, 'rb') as fh:
+                buf = 1
+                while (buf):
+                    buf = fh.read(blocksize)
+                    dosomething(buf)
+
+``process_content_with_progress2()`` is better than the naive approach because
+now we have predictive information:
+
+50%|██████████████████████\                      \| 2/4 [00:00<00:00,  4.06it/s]
+
+However, the progress is not smooth: it increments in steps, 1 step being
+1 file processed. The problem is that we do not just walk through files tree,
+but we process the files contents. Thus, if we stumble on one big fat file,
+it will take a huge deal more time to process than other smaller files, but
+the progress bar cannot know that, because we only supplied the files count,
+so it considers that every element is of equal processing weight.
+
+To fix this, we should use another indicator than the files count: the total
+sum of all files sizes. This would be more pertinent since the data we
+process is the files' content, so there is a direct relation between size and
+content.
+
+Below we implement this approach using a manually updated ``tqdm`` bar, where
+``tqdm`` will work on size, while the ``for`` loop works on files paths:
+
+.. code:: python
+
+    def process_content_with_progress3(inputpath, blocksize=1024):
+        # Preprocess the total files sizes
+        sizecounter = 0
+        for dirpath, dirs, files in tqdm(os.walk(inputpath)):
+            for filename in files:
+                fullpath = os.path.abspath(os.path.join(dirpath, filename))
+                sizecounter += os.stat(fullpath).st_size 
+
+        # Load tqdm with size counter instead of files counter
+        with tqdm(total=sizecounter, leave=True, unit='B', unit_scale=True) as pbar:
+            for dirpath, dirs, files in os.walk(inputpath):
+                for filename in files:
+                    fullpath = os.path.abspath(os.path.join(dirpath, filename))
+                    with open(fullpath, 'rb') as fh:
+                        buf = 1
+                        while (buf):
+                            buf = fh.read(blocksize)
+                            dosomething(buf)
+                            if buf: pbar.update(len(buf))
+
+And here is the result: a much smoother progress bar with meaningful
+predicted time and statistics:
+
+47%|██████████████████▍\                    \| 152K/321K [00:03<00:03, 46.2KB/s]
+
 Contributions
 -------------