Looking at the implementation in concurrent/futures/thread.py: each worker thread repeatedly takes one item from the work queue, runs it, and then checks whether the executor is shutting down. Worker threads are added lazily until the executor's max_workers limit is reached. New futures cannot be submitted once the executor is shutting down, and shutdown() waits until all workers have exited cleanly.
So it looks like the only time submitted futures are neither executed nor cancelled is when shutdown happens while there are more items in the work queue than there are worker threads. In that situation the worker threads simply exit, and the unprocessed items stay pending forever.
If I analyzed this correctly, perhaps we could add functionality to explicitly cancel leftover work items? I think that would satisfy the OP's requirement. I *think* it would be safe to do this in shutdown(), after it sets self._shutdown but before it waits for the worker threads.